How does an XBee coordinator handle simultaneous data from multiple nodes?

How can I make multiple nodes communicate with a coordinator without loss of data?
When multiple XBee nodes send their data simultaneously to the same XBee coordinator, won't there be congestion problems? To my knowledge, the answer is yes.
In that case, how can I avoid this congestion? I also want the system to work in real time, so there should not be any delay.
I came across the Stack Overflow question XBee - XBee-API and multiple endpoints, and I'm dealing with a similar problem.
How was this solved?

As you add devices on a network, the only way to avoid congestion is to transmit less frequently.
If you look at the XBee documentation, most of the modules have a "Transmit Status" frame that the host receives once the message has been successfully delivered (or abandoned due to errors). I believe the success response is triggered by a MAC-level ACK on the network.
If you have smart hosts on your nodes, they can adjust their transmit frequency by waiting for an ACK before sending their next frame, and maybe even using the retries counter in the Transmit Status frame to set a delay before sending.
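As a rough illustration, here is what such pacing could look like in C on the host side, assuming a Series 2 (ZigBee) module in API mode. The Transmit Status frame type (0x8B) and field offsets follow Digi's API documentation; read_api_frame() and send_api_frame() are hypothetical placeholders for your own serial framing code:

    /* Sketch: pace transmissions off the Transmit Status frame.
     * Frame data layout (ZigBee API, type 0x8B): [0]=type, [1]=frame ID,
     * [2..3]=16-bit addr, [4]=retry count, [5]=delivery status.
     * read_api_frame()/send_api_frame() are hypothetical placeholders. */
    #include <stdint.h>
    #include <unistd.h>

    extern int  read_api_frame(uint8_t *buf, int maxlen);
    extern void send_api_frame(const uint8_t *buf, int len);

    void send_paced(const uint8_t *frame, int len)
    {
        uint8_t rx[128];

        send_api_frame(frame, len);

        /* Block until the module reports delivery (or failure). */
        for (;;) {
            int n = read_api_frame(rx, sizeof(rx));
            if (n >= 7 && rx[0] == 0x8B) {
                uint8_t retries = rx[4];
                /* Back off in proportion to how congested the air was. */
                usleep(10000 * (1 + retries));
                break;
            }
        }
    }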
While the 802.15.4 protocol sends data at 250 kbit/s, the overhead of headers, relaying of messages across a mesh network, and dealing with collisions brings that down to around 100 kbit/s of useable bandwidth. Try to maximize the payload from your devices, to increase the data-to-headers ratio. Sending five pieces of data in a single frame every five seconds is better than one piece in a frame every second.
How much data do you need to send, and what is your definition of "real time"? Is a 10 ms delay acceptable? How about 100 ms? 500 ms? How many devices will try to send at the same time? How often will they send?
All of those questions will figure into your design, and you may find that 802.15.4 isn't suited for what you need to do.

I have set up 15 nodes of Series 2 XBee. Each node has multiple sensors (light, motion, etc.). The XBee sits on a Fio board and sends data every 3 minutes.
Nodes are in AT mode and the coordinator is in API mode. Nodes go through a few router XBees (AT mode). The coordinator (connected to a Raspberry Pi) collects the data and uploads it to a server.
It is not a mesh network and the XBees do not sleep.
At this scale, I did not encounter any congestion issues.
Hope this helps.

Related

XBee communication in large networks

Here is my situation:
I have a network of 96 XBee S2B and S2C modules. My application runs on an ARM module and has an XBee S2C module. All modules (in total 97 of them) are in the same network and are able to communicate with each other.
The software starts and knows the 64 bit addresses of all modules. It will do a network discovery (Local AT -> ND) and will wait for the responses. With each response the 16 bit address of that module is updated. If a module has not responded to the network discovery, the discovery is sent again every 30 seconds (in most tests, all nodes are discovered after 60 seconds).
Then, with all 64 bit and 16 bit addresses stored, the application will send a message to every node using unicasting. It will not wait between sending the messages. I tried this with 36, 42, 78 and now 96 nodes. With 36 nodes the messages are received within 3 seconds by every node (as expected); with 42 and 78 it takes 4 and 7 seconds respectively to reach every node. With 96, however, it takes at least 90 seconds.
There is no outside interference that I can detect and all nodes are within reach (if not, the network discovery would have failed).
I also tried using 64 bit messaging and ignoring the 16 bit address; it takes even longer with this method.
I am using the libxbee3 library made by attie (https://github.com/attie/libxbee3).
My question is: how do I speed up the communication time of the 96 nodes (keeping in mind that the goal is to be able to handle even bigger networks), and why is there such a big difference between 78 and 96 nodes (why is the network suddenly so slow)?
If there is any more information needed about my situation, I will be happy to provide it. As I manage the code I can perform tests if you need more information.
First off, get an 802.15.4 sniffer and start looking at the traffic to see what's going on. Without that, you're stuck guessing at what might be happening. I haven't worked with 802.15.4 in years, but outside of Ember Desktop (only available from Silicon Labs in expensive development kits) I was pleased with the Ubiqua Protocol Analyzer. You might also want to explore where Wireshark's 802.15.4 sniffing capabilities stand.
Second, try implementing code to wait for a Transmit Status message before sending your next message. Better yet, write code to keep track of multiple outstanding messages and test it out with various settings -- how does the network behave with 1 message waiting on a Transmit Status, versus 5 outstanding messages?
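As a sketch of that second suggestion, the host can cap the number of unicasts in flight and retire them as Transmit Status (0x8B) frames echo back the frame ID. MAX_IN_FLIGHT is the knob to experiment with; send_unicast() and poll_frame() are hypothetical stand-ins for the libxbee3 calls you already use:

    /* Sketch: cap unicasts in flight by matching Transmit Status frame
     * IDs against what we've sent. A real version would also time out
     * and retransmit; this only shows the windowing idea. */
    #include <stdint.h>

    #define MAX_IN_FLIGHT 5

    extern int  poll_frame(uint8_t *buf, int maxlen);          /* hypothetical */
    extern void send_unicast(uint8_t frame_id, int node_idx);  /* hypothetical */

    static uint8_t pending[256];   /* pending[frame_id] != 0 -> awaiting status */

    void send_to_all(int node_count)
    {
        uint8_t frame_id = 1, rx[128];
        int next = 0, in_flight = 0;

        while (next < node_count || in_flight > 0) {
            while (next < node_count && in_flight < MAX_IN_FLIGHT) {
                send_unicast(frame_id, next++);
                pending[frame_id] = 1;
                in_flight++;
                if (++frame_id == 0) frame_id = 1;  /* ID 0 disables status */
            }
            int n = poll_frame(rx, sizeof(rx));
            if (n >= 7 && rx[0] == 0x8B && pending[rx[1]]) {
                pending[rx[1]] = 0;   /* rx[1] is the echoed frame ID */
                in_flight--;
            }
        }
    }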
My guess is that you're running into challenges with the XBee modules managing a routing table for that many nodes. Digi provides a document for working with large XBee networks, which explains how to use Source Routing on a large network. Your central node may need to maintain a routing table and specify routes in outbound messages to improve network throughput.
The thing is that there are a lot of collisions and a large overhead on your network in a scenario where 96 nodes are involved.
My suggestion is to cluster your nodes under multiple routers as your network grows.
The issue is likely that you are using the standard ZigBee routing, AODV, which is essentially recalculated with every transmission. Once the number of nodes reaches a large number, the calculation takes exponentially longer. You should consider changing to Source Routing, which is basically a different frame type that also makes use of routes stored at the nodes. On a large, stable network this should be much faster for transmitting messages.
https://www.digi.com/wiki/developer/index.php/Large_ZigBee_Networks_and_Source_Routing
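For orientation, the outer API framing that any source-routed frame is wrapped in looks like this in C: start delimiter, big-endian length, then a checksum of 0xFF minus the byte sum of the frame data. The Create Source Route payload itself (frame type 0x21) is described in the Digi document above; check its exact field layout against your firmware's API reference before relying on it:

    /* Sketch of generic XBee API framing. 'data' would hold the
     * Create Source Route (0x21) payload per the Digi document above. */
    #include <stdint.h>

    int build_api_frame(uint8_t *out, const uint8_t *data, uint16_t len)
    {
        uint8_t sum = 0;
        out[0] = 0x7E;                 /* start delimiter */
        out[1] = (uint8_t)(len >> 8);  /* length MSB */
        out[2] = (uint8_t)(len & 0xFF);
        for (uint16_t i = 0; i < len; i++) {
            out[3 + i] = data[i];
            sum += data[i];
        }
        out[3 + len] = 0xFF - sum;     /* checksum over the frame data */
        return 4 + len;
    }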

Any benefit in sockets from multiple cores? (Linux)

I can't find the answer to this question:
Is there any benefit/boost to sockets in general on a multi-core machine? I mean, is there perhaps some kind of shared access to the queue of packets coming into the kernel from the Ethernet card driver, or something similar?
I understand that when it comes to the API calls there can be multiple threads working with one socket instance, but it is up to the programmer to synchronize and play correctly with calls to read/write/close/select, etc. So at that level I see a benefit only in working with dispatched packets, post-processing, etc. Or is there no speed boost until the packet is copied during the system call and transferred to user space?
The benefit of multiple cores depends on how much concurrency can be achieved in your algorithm. Take Ethernet receiving as an example; there are three tasks involved:
1. On receiving a packet, the NIC hardware triggers an interrupt and a CPU handles the interrupt in interrupt context.
2. The network stack handles the RX packet via the software IRQ mechanism. Software IRQ requests can run concurrently on multiple CPUs. In the network stack's RX function, the network buffer is passed to a socket, which wakes up the user thread pending on that socket.
3. The user thread wakes up and continues in application code, receiving or processing the received network packets.
Task 1 can only run on one CPU, task 2 can run on multiple CPUs, and task 3 can also run on multiple CPUs for multi-process or multi-threaded applications.
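Since task 3 is the part an application controls, one well-known way to spread it across cores on Linux (3.9 and later) is SO_REUSEPORT: each worker thread opens its own socket bound to the same port and the kernel load-balances incoming packets between them. A minimal sketch in C (port 8080 and the thread count of 4 are arbitrary choices):

    #include <netinet/in.h>
    #include <pthread.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    static void *worker(void *arg)
    {
        (void)arg;
        int fd = socket(AF_INET, SOCK_DGRAM, 0);
        int one = 1;
        /* All workers bind the same port; the kernel spreads packets. */
        setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8080);
        bind(fd, (struct sockaddr *)&addr, sizeof(addr));

        char buf[2048];
        for (;;) {
            ssize_t n = recv(fd, buf, sizeof(buf), 0);
            if (n > 0) { /* process packet on this thread's CPU */ }
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t[4];
        for (int i = 0; i < 4; i++)
            pthread_create(&t[i], NULL, worker, NULL);
        pause();
    }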
Following on from Greg Inozemtsev's comment: NICs with multiple receive queues can filter incoming traffic into different queues, and each queue can be assigned its own interrupt, which can be dispatched to a different CPU core to signal incoming packets.
Linux supports various techniques in the Kernel such as:
RSS: Receive Side Scaling
RPS: Receive Packet Steering
RFS: Receive Flow Steering
aRFS: Accelerated Receive Flow Steering
XPS: Transmit Packet Steering
Let's say all the packets are destined for your machine's IP on port 80 and you have used socket() and listen() to create a socket on port 80. Traffic is coming into your NIC from a variety of source IPs, so it is being hashed into multiple receive queues (meaning the hardware interrupts are being spread across multiple CPU cores thanks to RSS). You can then use the native kernel socket option PACKET_FANOUT to spread the load across multiple worker threads within your application.
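For reference, here is roughly what joining a PACKET_FANOUT group looks like in C. This is a minimal sketch, not production code: each worker thread would open its own AF_PACKET socket and join the same group ID, and the kernel then hashes flows across the members:

    #include <arpa/inet.h>
    #include <linux/if_ether.h>
    #include <linux/if_packet.h>
    #include <sys/socket.h>

    int open_fanout_socket(int group_id)
    {
        int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
        if (fd < 0)
            return -1;

        /* Low 16 bits: group ID; high 16 bits: load-balance mode. */
        int arg = group_id | (PACKET_FANOUT_HASH << 16);
        if (setsockopt(fd, SOL_PACKET, PACKET_FANOUT, &arg, sizeof(arg)) < 0)
            return -1;
        return fd;
    }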
If you look at third-party libraries such as NetMap, DPDK and VPP, just as examples, these can all be used to scale up even further by taking you down to zero-copy RX/TX, with the caveat that you need to write some of the network protocol code yourself, depending on which library you use.
There are many, many things to consider here, far too much to cover in one SO question. In answer to your original question: yes. I have tried to provide a bit of extra information as well.
For reading on RSS/RPS/RFS etc:
https://blog.cloudflare.com/how-to-receive-a-million-packets/
For further reading related to native PACKET_FANOUT and PACKET_MMAP:
http://kukuruku.co/hub/nix/capturing-packets-in-linux-at-a-speed-of-millions-of-packets-per-second-without-using-third-party-libraries
http://yusufonlinux.blogspot.co.uk/2010/11/data-link-access-and-zero-copy.html?m=1
https://www.kernel.org/doc/Documentation/networking/packet_mmap.txt
Further reading related to NetMap as an example of 3rd party library
https://blog.cloudflare.com/single-rx-queue-kernel-bypass-with-netmap/
Further reading on NUMA and affinity:
https://null.53bits.co.uk/index.php?page=numa-and-queue-affinity
https://blog.cloudflare.com/how-to-achieve-low-latency/

Speeding up UDP-based file transfer with loss protection?

I'm trying to learn UDP, and make a simple file transferring server and client.
I know TCP would potentially be better, because it has some reliability built in. However I would like to implement some basic reliability code myself.
I've decided to try and identify when packets are lost, and resend them.
What I've implemented is a system where the server will send the client a certain file in 10 byte chunks. After it sends each chunk, it waits for an acknowledgement. If it doesn't receive one in a few seconds time, it sends the chunk again.
My question is: how can a file transfer like this be done quickly? If you send a file and, let's say, there's a 25% chance a packet is lost, then a lot of time builds up waiting for ACKs.
Is there some way around this? Or is it accepted that with high packet loss it will take a very long time? What's an accepted timeout value for the acknowledgement?
Thanks!
There are many questions in your post; I will try to address some. The main thing is to benchmark and find the bottleneck. What is the slowest operation?
I can tell you now that the bottleneck in your approach is waiting for an ACK after each chunk. Instead of acknowledging chunks, you want to acknowledge sequences. The second biggest problem is the ridiculously small chunk. At that size there's more overhead than actual data (look up the header sizes for IP and UDP).
In conclusion:
What I've implemented is a system where the server will send the
client a certain file in 10 byte chunks.
You might want to try chunks of a few hundred bytes.
After it sends each chunk, it waits for an acknowledgement.
Send more chunks before requiring an acknowledgement, and label them. There is more than one way:
Instead of acknowledging chunks, acknowledge data: "I've received 5000 bytes" (TCP, traditional)
Acknowledge multiple chunks in one message. "I've received chunks 1, 5, 7, 9" (TCP with SACK)
What you've implemented is Stop-and-wait ARQ. In a high-latency network, it will inevitably be slower than some other more complex options, because it waits for a full cycle on each transmission.
For other possibilities, see Sliding Window and follow links to other variants. What you've got is basically a degenerate form of sliding window with window-size 1.
As other answers have noted, this will involve adding sequence numbers to your packets, sending additional packets while waiting for acknowledgement, and retransmitting on a more complex pattern.
If you do this, you are essentially reinventing TCP, which uses these tactics to supply a reliable connection.
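As a concrete illustration of the sliding-window idea, here is a minimal go-back-N sender sketch in C. The window size, chunk size, and 200 ms timeout are arbitrary starting points, and send_chunk()/recv_ack() are hypothetical placeholders for your UDP I/O:

    #include <stdint.h>

    #define WINDOW     8
    #define CHUNK_SIZE 512

    /* Hypothetical UDP helpers: recv_ack() returns 0 on success and
     * fills in the cumulative ack (count of in-order chunks received). */
    extern void send_chunk(uint32_t seq, const uint8_t *data, int len);
    extern int  recv_ack(uint32_t *acked_up_to, int timeout_ms);

    void send_file(const uint8_t *file, uint32_t file_len)
    {
        uint32_t base = 0;   /* oldest unacked chunk */
        uint32_t next = 0;   /* next chunk to send */
        uint32_t total = (file_len + CHUNK_SIZE - 1) / CHUNK_SIZE;

        while (base < total) {
            /* Fill the window: several chunks in flight at once. */
            while (next < total && next < base + WINDOW) {
                uint32_t off = next * CHUNK_SIZE;
                uint32_t len = file_len - off < CHUNK_SIZE ? file_len - off
                                                           : CHUNK_SIZE;
                send_chunk(next, file + off, (int)len);
                next++;
            }
            /* Wait for a cumulative ACK; on timeout, resend the window. */
            uint32_t acked;
            if (recv_ack(&acked, 200) == 0) {
                if (acked > base)
                    base = acked;   /* window slides forward */
            } else {
                next = base;        /* timeout: go back to oldest unacked */
            }
        }
    }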
You want some kind of packet numbering, so that the client can detect a lost packet by the missing number in the sequence of received packets. Then the client can request a resend of the packets it knows it is missing.
Example:
Server sends packet 1,2,3,4,5 to client. Client receives 1,4,5, so it knows 2 and 3 were lost. So client acks 1,4 and 5 and requests resend of 2 and 3.
Then you still need to work out how to handle acks / requests for resends, etc. In any case, assigning a sequence of consecutive numbers to the packets so that packet loss can be detected by "gaps" in the sequence is a decent approach to this problem.
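A minimal sketch of that receiver-side gap detection in C, assuming 32-bit sequence numbers and a hypothetical request_resend() that sends your resend-request message:

    #include <stdint.h>

    #define MAX_PACKETS 65536

    static uint8_t  seen[MAX_PACKETS];
    static uint32_t highest = 0;

    extern void request_resend(uint32_t seq);  /* hypothetical NACK */

    void on_packet(uint32_t seq)
    {
        if (seq < MAX_PACKETS)
            seen[seq] = 1;
        if (seq > highest)
            highest = seq;
    }

    void nack_gaps(void)
    {
        /* Any hole below the highest sequence seen is a lost packet. */
        for (uint32_t s = 0; s < highest; s++)
            if (!seen[s])
                request_resend(s);
    }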
Your question exactly describes one of the problems that TCP tries to answer. TCP's answer is particularly elegant and parsimonious, imo, so reading an English-language description of TCP might reward you.
Just to give you a ballpark idea of UDP in the real world: SNMP is a network-management protocol that is meant to operate over UDP. SNMP requests (around 1500 payload bytes) sent by a manager to a managed node are never explicitly acknowledged, and it works pretty well. Twenty-five percent packet loss is a huge number -- real-life packet loss is an order of magnitude smaller at worst -- and in such a broken environment SNMP would hardly work at all. Certainly a human being operating the network management system -- the NMS -- would be on the phone to network hardware support very quickly.
When we use SNMP, we generally understand that a good value for timeout is three or four seconds, meaning that the SNMP agent in the managed network node will probably have completed its work in that time.
HTH
Have a look at the TFTP protocol. It is a UDP-based file transfer protocol with built-in ack/resend provisions.

Custom RS485 Protocols

I am writing a simple multi-drop RS485 protocol for serial communications within a distributed system. I am using an addressable model where slave devices are given a window of 20 ms to respond. The master uC polls the connected devices for updates and they respond accordingly. I've employed checksums and take the necessary overrun precautions to ensure that connected devices will not respond to malformed messages. This method has proved effective in approximately 99% of situations, but I lose the packet if a new device is introduced during a communication session. Plugging in a new device "hot" will have negative effects on the signal being monitored by the slave devices, if only for an extremely short time. I'm on the software side of engineering, but how can I mitigate this situation without trying to recreate TCP? We use a polling model because it is fast and does the job well for our application; there is no need for RTOS functionality. I have an abundance of cycles on each CPU, so think in basic terms.
Sending packets over RS485 is not reliable communication, so you will have to handle lost packets anyway. Of course, you won't have to reinvent TCP, but you will have to detect lost packets by means of timeout monitoring and sequence numbers. In simple applications this can be done at the application level, which keeps you far away from the complexity of TCP. Since your polling model already discards all packets with an invalid checksum, this can be integrated with little effort.
If you want to check for collisions, which can be caused by hot plugs or misbehaving devices, there are some possible improvements. Some hardware allows you to read back your own transmission: if you find a difference between the sent data and the received data, you can assume a collision and repeat the packet. This will also require some kind of sequence numbering.
Perhaps I've missed something in your question, but can't you just write the master so that if a response isn't seen from a device within the allowed time, it re-polls that device?
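That re-poll logic might look roughly like this in C on a POSIX serial port, using select() to enforce the 20 ms response window. fd, poll_slave() and frame_valid() are placeholders for your own UART access and checksum check:

    #include <stdint.h>
    #include <sys/select.h>
    #include <unistd.h>

    extern int  fd;                                     /* UART descriptor */
    extern void poll_slave(uint8_t addr);               /* hypothetical */
    extern int  frame_valid(const uint8_t *b, int n);   /* checksum check */

    int poll_with_retry(uint8_t addr, uint8_t *buf, int maxlen, int retries)
    {
        for (int attempt = 0; attempt <= retries; attempt++) {
            poll_slave(addr);

            fd_set rd;
            struct timeval tv = { .tv_sec = 0, .tv_usec = 20000 }; /* 20 ms */
            FD_ZERO(&rd);
            FD_SET(fd, &rd);

            if (select(fd + 1, &rd, NULL, NULL, &tv) > 0) {
                int n = (int)read(fd, buf, maxlen);
                if (n > 0 && frame_valid(buf, n))
                    return n;       /* good response */
            }
            /* Timeout or bad checksum: re-poll the same device. */
        }
        return -1;                  /* give up after the allowed retries */
    }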

Maximizing performance on udp

I'm working on a project with two clients, one for sending and the other for receiving UDP datagrams, between two machines wired directly to each other.
Each datagram is 1024 bytes in size, and it is sent using Winsock (blocking).
They are both running on very fast (separate) machines, with 16 GB RAM, 8 CPUs, and RAID 0 drives.
I'm looking for tips to maximize my throughput. Tips should be at the Winsock level, but if you have other tips, that would be great too.
Currently I'm getting 250-400 Mbit/s transfer speed; I'm looking for more.
Thanks.
Since I don't know what your applications do besides sending and receiving, it's difficult to know what might be limiting them, but here are a few things to try. I'm assuming that you're using IPv4, and I'm not a Windows programmer.
Maximize the packet size that you are sending when you are on a reliable connection. For 100 Mbit/s Ethernet the maximum frame is 1518 bytes; Ethernet uses 18 of that, IPv4 uses 20-64 (usually 20, though), and UDP uses 8 bytes. That means that typically you should be able to send 1472 bytes of UDP payload per packet.
If you are using gigabit Ethernet equipment that supports it, your packet size increases to 9000 bytes (jumbo frames), so sending something closer to that size should speed things up.
If you are sending any acknowledgments from your listener to your sender then try to make sure that they are sent rarely and can acknowledge more than just one packet at a time. Try to keep the listener from having to say much, and try to keep the sender from having to wait on the listener for permission to keep sending.
On the computer that the sender application lives on consider setting up a static ARP entry for the computer that the receiver lives on. Without this every few seconds there may be a pause while a new ARP request is made to make sure that the ARP cache is up to date. Some ARP implementations may do this request well before the ARP entry expires, which would decrease the impact, but some do not.
Turn off as many other users of the network as possible. If you are using an Ethernet switch, concentrate on whatever introduces traffic to or from the computers and network devices that your applications run on or use (this includes broadcast messages, like many ARP requests). If it's a hub, you may want to quiet down the entire network. Windows tends to send out a constant stream of junk to networks, which in many cases isn't useful.
There may be limits set on how much of the network bandwidth one application or user can have, or on how much network bandwidth the OS will let itself use. These can probably be changed in the registry if they exist.
It is not uncommon for network interface chips to not actually support the maximum bandwidth of the network all the time. There are chips which may miss packets because they are busy handling a previous packet as well as some which just can't send packets as close together as Ethernet specifications would allow. Additionally the rest of the system might not be able to keep up even if it is.
Some things to look at:
Connected UDP sockets shortcut several operations in the kernel, so they are faster (see Stevens' UNP book for details).
Socket send and receive buffers: play with the SO_SNDBUF and SO_RCVBUF socket options to balance out spikes and packet drops (both this and the previous point are shown in the sketch after this list).
See if you can bump up the link MTU and use jumbo frames.
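Here is a minimal sketch in C of the first two points together: connecting the UDP socket and enlarging the socket buffers. It is POSIX-flavoured (Winsock is analogous apart from minor casts), and the 4 MB buffer size is an arbitrary starting point to tune; the kernel may also clamp it (see net.core.rmem_max/wmem_max on Linux):

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdint.h>
    #include <string.h>
    #include <sys/socket.h>

    int make_fast_udp_socket(const char *dest_ip, uint16_t port)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);
        int bufsz = 4 * 1024 * 1024;   /* arbitrary; tune for your load */

        setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bufsz, sizeof(bufsz));
        setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bufsz, sizeof(bufsz));

        struct sockaddr_in dst;
        memset(&dst, 0, sizeof(dst));
        dst.sin_family = AF_INET;
        dst.sin_port = htons(port);
        inet_pton(AF_INET, dest_ip, &dst.sin_addr);

        /* After connect(), plain send()/recv() work and the kernel can
         * skip per-packet destination handling for this flow. */
        connect(fd, (struct sockaddr *)&dst, sizeof(dst));
        return fd;
    }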
Use a 1 Gbit/s network and upgrade your network hardware...
Test the packet limit of your hardware with an already proven piece of code such as iperf:
http://www.noc.ucf.edu/Tools/Iperf/
I'm linking a Windows build; it might be a good idea to boot off a Linux LiveCD and try a Linux build for comparison of IP stacks.
More likely your NIC isn't performing well, try an Intel Gigabit Server Adapter:
http://www.intel.com/network/connectivity/products/server_adapters.htm
For TCP connections it has been shown that using multiple parallel connections will better utilize the data connection. I'm not sure if that applies to UDP, but it might help with some of the latency issues of packet processing.
So you might want to try multiple threads of blocking calls.
As well as Nikolai's suggestion of send and receive buffers: if you can, switch to overlapped I/O and have many recvs pending. This also helps to minimise the number of datagrams that are dropped by the stack due to lack of buffer space.
If you're looking for reliable data transfer, consider UDT.
