A-sync Speed over Gigabit LAN Ethernet File Transfer - file

I have a strange problem with transferring files between 2 computers connected to each other through an ethernet cable. Both PCs have on-board gigabit ethernet ports. Aside from the different hardwares, the softwares (especially network settings) are configured almost the same, with Windows 7 x64 etc. Tests have been taken with and without antivirus programs running with no difference. Duplex settings are auto negotiation. Jumbo packets (~9MB) are enabled (usually I'm transferring really large files). Hard drives are not a problem, since local transfer speed within a computer is around 100 MB/s.
Now if I am on PC1, and accessing shared files on PC2: Transferring files from PC1 to PC2 is very fast, usually in the range of 60 MB/s (see results below from LAN SpeedTest). But the opposite (transferring from PC2 to PC1) is really slow, about 10 MB/s.
Speed Test 1
If I am on PC2, and accessing PC1: Transferring files from PC2 to PC1 is slow (see speed test below - it's actually a little slower than when I'm transferring files and reading the speed report from Windows), while the opposite is fast (also about 60 MB/s like in the first case)
(I would post link 2 here but it does not allow me to since I am new)
So what causes this?
TIA

It might have to do with the hard drive write speed of PC1. The read speed is fast enough to send data to PC2, but the write speed is not fast enough to receive the data from PC2.

This may be a silly question but when you say the computers are connected via an Ethernet cable do you mean directly connected or via a switch? If directly connected, are you using a crossover Ethernet cable or straight-through cable? If using a straight-through cable, I would say that one or both of the cards are failing to auto-sense if they have that capability. Also auto-duplex and auto-speed settings have been known to fail in the past, try setting them manually as well.

Related

How do I increase the speed of my USB cdc device?

I am upgrading the processor in an embedded system for work. This is all in C, with no OS. Part of that upgrade includes migrating the processor-PC communications interface from IEEE-488 to USB. I finally got the USB firmware written, and have been testing it. It was going great until I tried to push through lots of data only to discover my USB connection is slower than the old IEEE-488 connection. I have the USB device enumerating as a CDC device with a baud rate of 115200 bps, but it is clear that I am not even reaching that throughput, and I thought that number was a dummy value that is a holdover from RS232 days, but I might be wrong. I control every aspect of this from the front end on the PC to the firmware on the embedded system.
I am assuming my issue is how I write to the USB on the embedded system side. Right now my USB_Write function is run in free time, and is just a while loop that writes one char to the USB port until the write buffer is empty. Is there a more efficient way to do this?
One of my concerns that I have, is that in the old system we had a board in the system dedicated to communications. The CPU would just write data across a bus to this board, and it would handle communications, which means that the CPU didn't have to waste free time handling the actual communications, but could offload the communications to a "co processor" (not a CPU but functionally the same here). Even with this concern though I figured I should be getting faster speeds given that full speed USB is on the order of MB/s while IEEE-488 is on the order of kB/s.
In short is this more likely a fundamental system constraint or a software optimization issue?
I thought that number was a dummy value that is a holdover from RS232 days, but I might be wrong.
You are correct, the baud number is a dummy value. If you create a CDC/RS232 adapter you would use this to configure your RS232 hardware, in this case it means nothing.
Is there a more efficient way to do this?
Absolutely! You should be writing chunks of data the same size as your USB endpoint for maximum transfer speed. Depending on the device you are using your stream of single byte writes may be gathered into a single packet before sending but from my experience (and your results) this is unlikely.
Depending on your latency requirements you can stick in a circular buffer and only issue data from it to the USB_Write function when you have ENDPOINT_SZ number of byes. If this results in excessive latency or your interface is not always communicating you may want to implement Nagles algorithm.
One of my concerns that I have, is that in the old system we had a board in the system dedicated to communications.
The NXP part you mentioned in the comments is without a doubt fast enough to saturate a USB full speed connection.
In short is this more likely a fundamental system constraint or a software optimization issue?
I would consider this a software design issue rather than an optimisation one, but no, it is unlikely you are fundamentally stuck.
Do take care to figure out exactly what sort of USB connection you are using though, if you are using USB 1.1 you will be limited to 64KB/s, USB 2.0 full speed you will be limited to 512KB/s. If you require higher throughput you should migrate to using a separate bulk endpoint for the data transfer.
I would recommend reading through the USB made simple site to get a good overview of the various USB speeds and their capabilities.
One final issue, vendor CDC libraries are not always the best and implementations of the CDC standard can vary. You can theoretically get more data through a CDC endpoint by using larger endpoints, I have seen this bring host side drivers to their knees though - if you go this route create a custom driver using bulk endpoints.
Try testing your device on multiple systems, you may find you get quite different results between windows and linux. This will help to point the finger at the host end.
And finally, make sure you are doing big buffered reads on the host side, USB will stop transferring data once the host side buffers are full.

Delay measurement and synchronisation between raspberry pi

I am doing a project with 2 raspberry pi which work as servers and a laptop which is the client.
I have attached to each raspberry and usb microphone and using the Portaudio Library im capturing audio streaming
and send it back to the laptop through a tcp/ip connection.
The scope of this project is to locate sound sources and it works like this. I run a .c file on each raspberry which are
connected on the same LAN as with the PC laptop. When this program is running on both raspberryies i have a message
"Waiting connection for a client". The next thing to do is just to run the matlab file which will start the both raspberries
and record. I have managed to synchronize the raspberries to start in the same time through a simple condition like
do
{
sleep(0.01);
j = read(newsockfd, &start,1 );
} while (j==0);
so right before both raspberries have to start recording i pause them in order to finish the initialization commands and so on
and then i just send a character "start = 'k'" through my matlab program
t1,t2 are tcp connections
start = 'k';
fwrite (t1, k);
fwrite (t2, k);
from this point both raspberries open the PortAudio stream and call recordCallBack function.
When I run the application and clap, i still get a delay of 0.2s between them which causes
an error of 60 meters. I have also checked the execution time of the fwrite function but that might
save me about 0.05 seconds which will still lead to results far from reality.
This project is based on TDOA measurement and it is desired to have a delay under 0.01 seconds to get accuracy <1m.
I have heard that linux has some very accurate timers, and i was thinking that maybe i could use that to
clock the time inside the functions in the .c file. Anyway if you have any ideas of how i can measure the delay from
the point i send the character 'k' from matlab until the point where the audio stream is opened in microphone, or any
way how i could synchronize the 2 linux servers please help.
ps: both are raspberry 2 pi and connected through UTP cables so the processing and transmission rates should be the same
It looks like an interesting project but I think you underestimate the problem a little bit. The first issue is that you need to synchonize the two sensors. Given the speed of sound and if you want an accuracy of about 1m you need to synchronize them with about 1ms accuracy. You could try with the Network Time Protocol but I'm not sure you can reach this accuracy even with a master on the local network. Better synchronization can be achieved with PTP (over ethernet) or GPS if you can receive a GPS signal.
Then if you manage to achieve this, a first step could be to record a few hand claps on both raspberry pi, save the timestamp when you start recording on both and see if you actually obtain something significant. Maybe you will also need to use a microcontroller and a real-time operating system instead!
There are many ways to synchronise clocks. It could be in a system level or in application level.
System level tend to be easier because there are already tools to do the job. I don't recommend you doing PTP at this stage, as mentioned by Emilien, since it is quite complicated to make it work. Instead I would recommend you to use normal setup via the same NTP network on all machines.
Example of NTP setup:
Query the server with # ntpdate -q 0.rhel.pool.ntp.org
If it is running, setup your local clock with # ntpdate 0.rhel.pool.ntp.org 1.rhel.pool.ntp.org
OBS: # means root user (which most likely means that you will need to run the command with sudo), whilst $ means normal user.
Check all machines times with $ date +%k:%M:%S.%N which will return the clock down to a nanosecond resolution.
If that doesn't acheive the desired result then try the PTP aproach, or just synchronise all your devices when they connect to the master, where your master can normalise each independant clock. I will not go into details here.
Then you can send your audio data via TCP/IP (or perhaps UDP/IP to lower latency) like you mentioned before, but always send the timestamp of your slave machine associated to a audio frame using clock_gettime() function with CLOCK_REALTIME as the clk_id argument.

Waiting for all USB drives to load

I'm currently working with an application that has branching logic depending on whether a specific USB drive is inserted into the system. It does this by polling all drive letters looking for a path on the root of each drive.
This works for the majority of machines, but often, this application runs on startup with the USB drive inserted. In addition, some machines are especially slow and take a good minute to load the USB drive once Windows boots. In these machines, the code reaches the check of whether this drive exists, it can't find the drive, and the wrong branch is executed.
It could be possible to wait a minute before checking for drives. I'd prefer to have the application wait for all USB devices (or simply only mass storage devices) to load before checking for the drives, or something even more intelligent.
Unfortunately, I am unfamiliar with the methods needed to wait for all USB devices to finish loading, and with the DDK in general. I can see that it's possible to register a window for device notifications with GUID_DEVINTERFACE_USB_DEVICE, possibly receiving messages like DBT_DEVICEARRIVAL, DBT_DEVNODES_CHANGED, and WM_DEVICECHANGE, the distinctions of which I do not know.
However, the USB drive will likely be already inserted and detected (but not having a drive letter) into the system before the program executes. So, registering for all device change notifications would not make sense. If it's possible to identify devices which are inserted but not loaded (possibly with SetupDiEnumDeviceInterfaces) and then register for "loaded" notifications on all of those devices, it might work. I don't have familiarity with any this, so pointers (or sample code) would be extremely helpful.
I don't think you can.
The problem is that you're tyring to distinguish two cases which are not actually distinct, that of plugging in a USB device during boot and plugging it in later.
You have to know that USB is a protocol which requires a non-trivial amount of intelligence in the USB slave devices. They exchange multiple messages with the USB host (i.e. your OS). This exchange isn't instant. For instance, your USB harddrive will need to ask for permission to draw more than 100mA power. To answer that, the power drivers of Windows must be up and running. The phsyical disk can only spin up when the answer arrives.
So, there's a whole message sequence going on, and the drive letter shows up only fairly late. Windows must know how many partitions exist. So during this exchange, new devices are being created all the time.
When you enumerate devices while devices are actively being added, you're really asking for troubles. The SetupDiEnumDeviceInterfaces API doesn't operate on a snapshot (which we know because there's no Close method); you're asking for the N'th device until you get an "no more devices" error an you know N was to big. But when devices are still actively being added, N changes. And I don't see a guarantee that the list order is by age; devices might be added in the middle as well.
I don't think that getting notifications about drivers being installed for newly plugged-in devices would help you much. When you plug in the same device repeatedly, the drivers are usually installed only the first time.
Also, an USB flash drive, although physically looks like one compact device, is represented by at least three PnP devices by Windows: an USB mass storage device (represents the USB endpoint), a disk device (represents the physical disk inside the flash drive) and one or multiple Volume devices (each represents one volume = partition in your case). Drive letters may be assigned to the Volume devices.
What you can possibly do is monitoring the arrival and removal of Volume devices (RegisterDeviceNotification for GUID_DEVINTERFACE_VOLUME) and examining each volume device that arrives (I believe that Setup API allows you to track its "parents" to the USB stack).

Possible causes for lack of data loss over my localhost UDP protocol?

I just implemented my first UDP server/client. The server is on localhost.
I'm sending 64kb of data from client to server, which the server is supposed to send back. Then, the client checks how many of the 64kb are still intact and they all are. Always.
What are the possible causes for this behaviour? I was expecting at least -some- dataloss.
client code: http://pastebin.com/5HLkfcqS
server code: http://pastebin.com/YrhfJAGb
PS: A newbie in network programming here, so please don't be too harsh. I couldn't find an answer for my problem.
The reason why you are not seeing any lost datagrams is that your network stack is simply not running into any trouble.
Your localhost connection can easily cope with what you provide, a localhost connection is able to process several 100 megabyte of data per second on a decent CPU.
To see dropped datagrams you should increase the probability of interference. You have several opportunities:
increase the load on the network
busy your cpu with other tasks
use a "real" network and transfer data between real machines
run your code over a dsl line
set up a virtual machine and simulate network outages (Vmware Workstation is able to do so)
And this might be an interesting read: What would cause UDP packets to be dropped when being sent to localhost?

Maximizing performance on udp

im working on a project with two clients ,one for sending, and the other one for receiving udp datagrams, between 2 machines wired directly to each other.
each datagram is 1024byte in size, and it is sent using winsock(blocking).
they are both running on a very fast machines(separate). with 16gb ram and 8 cpu's, with raid 0 drives.
im looking for tips to maximize my throughput , tips should be at winsock level, but if u have some other tips, it would be great also.
currently im getting 250-400mbit transfer speed. im looking for more.
thanks.
Since I don't know what else besides sending and receiving that your applications do it's difficult to know what else might be limiting it, but here's a few things to try. I'm assuming that you're using IPv4, and I'm not a Windows programmer.
Maximize the packet size that you are sending when you are using a reliable connection. For 100 mbs Ethernet the maximum packet is 1518, Ethernet uses 18 of that, IPv4 uses 20-64 (usually 20, thought), and UDP uses 8 bytes. That means that typically you should be able to send 1472 bytes of UDP payload per packet.
If you are using gigabit Ethernet equiptment that supports it your packet size increases to 9000 bytes (jumbo frames), so sending something closer to that size should speed things up.
If you are sending any acknowledgments from your listener to your sender then try to make sure that they are sent rarely and can acknowledge more than just one packet at a time. Try to keep the listener from having to say much, and try to keep the sender from having to wait on the listener for permission to keep sending.
On the computer that the sender application lives on consider setting up a static ARP entry for the computer that the receiver lives on. Without this every few seconds there may be a pause while a new ARP request is made to make sure that the ARP cache is up to date. Some ARP implementations may do this request well before the ARP entry expires, which would decrease the impact, but some do not.
Turn off as many users of the network as possible. If you are using an Ethernet switch then you should concentrate on the things that will introduce traffic to/from the computers/network devices on which your applications are running reside/use (this includes broadcast messages, like many ARP requests). If it's a hub then you may want to quiet down the entire network. Windows tends to send out a constant stream of junk to networks which in many cases isn't useful.
There may be limits set on how much of the network bandwidth that one application or user can have. Or there may be limits on how much network bandwidth the OS will let it self use. These can probably be changed in the registry if they exist.
It is not uncommon for network interface chips to not actually support the maximum bandwidth of the network all the time. There are chips which may miss packets because they are busy handling a previous packet as well as some which just can't send packets as close together as Ethernet specifications would allow. Additionally the rest of the system might not be able to keep up even if it is.
Some things to look at:
Connected UDP sockets (some info) shortcut several operations in the kernel, so are faster (see Stevens UnP book for details).
Socket send and receive buffers - play with SO_SNDBUF and SO_RCVBUF socket options to balance out spikes and packet drop
See if you can bump up link MTU and use jumbo frames.
use 1Gbps network and upgrade your network hardware...
Test the packet limit of your hardware with an already proven piece of code such as iperf:
http://www.noc.ucf.edu/Tools/Iperf/
I'm linking a Windows build, it might be a good idea to boot off a Linux LiveCD and try a Linux build for comparison of IP stacks.
More likely your NIC isn't performing well, try an Intel Gigabit Server Adapter:
http://www.intel.com/network/connectivity/products/server_adapters.htm
For TCP connections it has been shown that using multiple parallel connections will better utilize the data connection. I'm not sure if that applies to UDP, but it might help with some of the latency issues of packet processing.
So you might want to try multiple threads of blocking calls.
As well as Nikolai's suggestion of send and recv buffers, if you can, switch to overlapped I/O and have many recvs pending, this also helps to minimise the number of datagrams that are dropped by the stack due to lack of buffer space.
If you're looking for reliable data transfer, consider UDT.

Resources