Purpose of NTP vs. ICMP Timestamp message

I know the purpose of the Network Time Protocol is to synchronize clocks over networks, primarily with the use of the Originate, Receive and Transmit timestamps to make the time calculations.
But the ICMP protocol also has a Timestamp control message (and a corresponding Timestamp Reply), which "is used for time synchronization". It also contains three timestamp fields with the same names as in NTP, which are presumably used in a similar manner.
So, what are the differences between the two? I guess the distinction is not that NTP is for desktop operating systems and ICMP for Layer 3 devices, since I know of Cisco switches that use NTP.

The timestamps may look like similar fields, but they have different lengths and very different content.
ICMP timestamp fields are 32 bits wide and carry a relative time: the number of milliseconds elapsed since the previous UTC midnight, recorded as the ICMP packet is handled on the outgoing/incoming side of the network connection.
The highest-order bit is set to flag a host time that is not UTC-coordinated / a non-standard value.
NTP timestamp fields are 64 bits wide and carry an absolute time, recorded as 32 bits of seconds since the epoch (1 January 1900, rolling over in 2036) plus another 32 bits for the fraction of a second (thus taking the time measurement down to sub-nanosecond resolution).
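
As a rough illustration of the two wire formats (these C struct and field names are mine, not taken from the RFCs):

#include <stdint.h>

/* ICMP Timestamp / Timestamp Reply (RFC 792): three 32-bit timestamps holding
   milliseconds since the previous UT midnight; the high-order bit flags a
   non-standard value. */
struct icmp_timestamp_msg {
    uint8_t  type;        /* 13 = Timestamp, 14 = Timestamp Reply */
    uint8_t  code;        /* 0 */
    uint16_t checksum;
    uint16_t identifier;
    uint16_t sequence;
    uint32_t originate;   /* ms since UT midnight, network byte order */
    uint32_t receive;
    uint32_t transmit;
};

/* NTP timestamp (RFC 5905): 64 bits carrying an absolute time. */
struct ntp_timestamp {
    uint32_t seconds;     /* seconds since 1 Jan 1900, rolls over in 2036 */
    uint32_t fraction;    /* units of 1/2^32 s, roughly 233 ps resolution */
};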

Measuring Elapsed Time In Linux (CLOCK_MONOTONIC vs. CLOCK_MONOTONIC_RAW)

I am currently trying to talk to a piece of hardware in userspace (underneath the hood, everything is using the spidev kernel driver, but that's a different story).
The hardware will tell me that a command has completed by indicating so with a special value in a register that I am reading. The hardware is also required to get back to me within a certain time, otherwise the command has failed. Different commands take different times.
As a result, I am implementing a way to set a timeout and then check for that timeout using clock_gettime(). In my "set" function, I take the current time and add the interval I should wait for (usually anywhere from a few ms to a couple of seconds). I then store this value for safekeeping.
In my "check" function, I once again get the current time and compare it against the time I have saved. This seems to work as I had hoped.
Given my use case, should I be using CLOCK_MONOTONIC or CLOCK_MONOTONIC_RAW? I'm assuming CLOCK_MONOTONIC_RAW is better suited, since the intervals I am checking are short. I am worried that such a short interval might fall in a system-wide outlier period in which NTP was doing a lot of adjusting. Note that my target systems run Linux kernels 4.4 and newer.
Thanks in advance for the help.
Edited to add: given my use case, I need "wall clock" time, not CPU time. That is, I am checking to see if the hardware has responded in some wall clock time interval.
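
For what it's worth, a minimal sketch of the set/check pattern described above, assuming CLOCK_MONOTONIC is the clock chosen (the same code works unchanged with CLOCK_MONOTONIC_RAW); the function names are made up for illustration:

#include <stdbool.h>
#include <time.h>

static struct timespec deadline;

/* "set": record now + timeout_ms as the absolute deadline. */
void timeout_set(long timeout_ms)
{
    clock_gettime(CLOCK_MONOTONIC, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {      /* carry into seconds */
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }
}

/* "check": has the deadline passed? */
bool timeout_expired(void)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return now.tv_sec > deadline.tv_sec ||
           (now.tv_sec == deadline.tv_sec && now.tv_nsec >= deadline.tv_nsec);
}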
References:
Rutgers Course Notes
What is the difference between CLOCK_MONOTONIC & CLOCK_MONOTONIC_RAW?
Elapsed Time in C Tutorial

Implementing Primary NTP Server (GPS Receiver)

I'm trying to implement an NTP server based on an NMEA GPS Receiver. I'm not sure what to fill the root delay field with.
I've read the NTPv4 specification and it's written that root delay is the total round-trip delay to the reference clock.
If I'm working with a secondary server, root delay can be calculated from the difference between the timestamps exchanged when making packet requests to the reference server (am I correct?).
But I'm not sure what to fill it with if I'm using a GPS receiver as the reference clock; should I just fill it with 0 instead?
It will depend largely on how you're setting the time in your server from the GPS. If you're reading the NMEA sentence, interpreting it and setting the clock, the root delay would be the time taken to do that. But it wouldn't be a very good clock; there are a lot of non-deterministic delays (jitter) involved in reading RS232 (assuming that is how you're connected to the GPS).
You can use the 1 pulse per second output of a GPS receiver to fix that. It's normally on the Data Carrier Detect pin. Using a proper RS232 port (not a USB one) you can have the server's clock synchronised to that (DCD can be used to raise an interrupt), so now you get very good alignment to GPS time. This could certainly be done in Solaris (a native part of the kernel), and in Linux too (http://support.ntp.org/bin/view/Support/ConfiguringNMEARefclocks). If you're doing this then I think that the root delay would be small, but there's the matter of the OS and hardware's response time to interrupts.
EDIT
According to this NTP docs page,
Root Delay
This is the total roundtrip delay to the primary reference source at the root of the synchronization subnet, in seconds. Note that this variable can take on both positive and negative values, depending on clock precision and skew.
So with 1PPS it's going to be pretty low. So far as I can tell it's a field that a secondary NTP server uses to tell its clients what its delay to a reference clock is. So if you have a 1PPS locked GPS time source, you are a reference clock. In which case, perhaps zero is correct enough; I don't think that NTP can achieve cross-network time synchronisation accuracies (1ms at best) better than the IRQ response time of a computer (< 50us hopefully with a good CONFIG_PREEMPT_RT linux kernel with nothing else going on).
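
For reference, the root delay field in the NTP packet header is carried in NTP short format (16 bits of seconds plus 16 bits of fraction), so whichever value you decide on, filling it in might look something like this; the helper name is mine, not from any NTP implementation:

#include <stdint.h>
#include <arpa/inet.h>

/* Convert seconds to NTP short format (16.16 fixed point), network byte order. */
static uint32_t ntp_short_from_seconds(double seconds)
{
    return htonl((uint32_t)(seconds * 65536.0));
}

/* e.g. pkt.root_delay = ntp_short_from_seconds(0.0);     stratum-1, 1PPS-locked */
/*      pkt.root_delay = ntp_short_from_seconds(0.0002);  ~200 us NMEA/serial path */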

Nanosecond Precision Variable Timer for Kernel 4.X using CLOCK_REALTIME

I have been working on timers for a while but am still unable to find a satisfactory solution for my situation.
Basically, I want to send packets at specific times.
For example :
1st PACKET at 1486500720.000000000
-> wait -> nanosleep(1000000000)
2nd PACKET at 1486500721.000000000
-> wait -> nanosleep(1000000000)
3rd PACKET at 1486500722.000000000
-> wait -> nanosleep(1000000000)
4th PACKET at 1486500723.000000000
The time gap between them is exactly 1.000000000 second, but when I send a packet it takes a different amount of time each time.
For example, the 1st packet takes 0.005025045 seconds to send, and only then does the nanosleep start.
So my second packet is sent at 1486500721.005025045 instead of 1486500721.000000000.
So every time I have to adjust the nanosleep value using clock_gettime(CLOCK_REALTIME), subtracting the time already spent and allowing for the overhead of the clock_gettime call itself.
As I have to do this in a loop with nanosecond precision (I know that is not really achievable, but I want it to be as precise as possible), I use a simple for loop.
My question is: is there any better way to do this with more precision? I am on kernel 4.4, so are you aware of any approach that works on newer kernels, or any other approach likely to be more precise than mine?
You should use clock_nanosleep, which allows an absolute time to wait until, rather than nanosleep, which only uses relative times. However there's no way you're going to get the degree of precision you're asking for. Just returning to userspace after the clock expires is going to take at least a few hundred nanoseconds if not a thousand or more. And then there's unpredictable latency to send the packet after you wake, too.
At best you might be able to tune your wake time to be "close enough" for your purposes.
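
A minimal sketch of that approach, with the next deadline advanced by exactly one period from the previous deadline (not from "now"), so per-send overhead does not accumulate; send_packet() is a hypothetical stand-in for your transmit code:

#include <time.h>

void send_packet(void);   /* hypothetical: whatever actually transmits */

void send_paced(long interval_ns, int count)
{
    struct timespec next;
    clock_gettime(CLOCK_REALTIME, &next);            /* first deadline: now */

    for (int i = 0; i < count; i++) {
        /* Sleep until the absolute deadline, not for a relative duration. */
        clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, &next, NULL);
        send_packet();
        next.tv_nsec += interval_ns;                 /* advance by one period */
        while (next.tv_nsec >= 1000000000L) {
            next.tv_sec  += 1;
            next.tv_nsec -= 1000000000L;
        }
    }
}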

Receiving RAW socket packets with microseconds level accuracy

I am writing code that receives raw Ethernet packets (no TCP/UDP) from the server every 1 ms. For every packet received, my application has to reply with 14 raw packets. If the server doesn't receive the 14 packets before it sends its next packet (scheduled every 1 ms), the server raises an alarm and the application has to break out. The server-client communication is a one-to-one link.
The server is a hardware (FPGA) which generates packets at precise 1ms interval. The client application runs on a Linux (RHEL/Centos 7) machine with 10G SolarFlare NIC.
My first version of the code looks like this:
while (1)
{
    while (1)
    {
        numbytes = recvfrom(sockfd, buf, sizeof(buf), 0, NULL, NULL);
        if (numbytes > 0)
        {
            // Some more lines here, to read packet number
            break;
        }
    }
    for (i = 0; i < 14; i++)
    {
        if (sendto(sockfd, (void *)(sym), sizeof(sym), 0, NULL, NULL) < 0)
            perror("Send failed\n");
    }
}
I measure the receive time by taking timestamps (using clock_gettime) before and after the recvfrom call; I print the difference between these timestamps whenever it exceeds the allowable range of 900-1100 us.
The problem I am facing is that the packet receive time fluctuates, something like this (the prints are in microseconds):
Decode Time : 1234
Decode Time : 762
Decode Time : 1593
Decode Time : 406
Decode Time : 1703
Decode Time : 257
Decode Time : 1493
Decode Time : 514
and so on..
And sometimes the decode times exceed 2000 us and the application breaks.
In this situation, the application breaks anywhere between 2 seconds and a few minutes after starting.
Options I have tried so far:
Setting affinity to a particular isolated core.
Setting scheduling priorities to maximum with SCHED_FIFO
Increase socket buffer sizes
Setting network interface interrupt affinity to the same core which processes application
Spinning over recvfrom using poll(),select() calls.
All these options give a significant improvement over the initial version of the code. Now the application runs for ~1-2 hours, but this is still not enough.
A few observations:
I get a huge dump of these decode time prints whenever I open ssh sessions to the Linux machine while the application is running (which makes me think network traffic over the other 1G Ethernet interface is interfering with the 10G Ethernet interface).
The application performs better on RHEL (run times of about 2-3 hours) than on CentOS (run times of about 30 min - 1.5 hours).
The run times also vary across Linux machines with different hardware configurations running the same OS.
Please suggest if there are any other methods to improve the run-time of the application.
Thanks in advance.
First, you need to verify the accuracy of the timestamping method, clock_gettime. Its resolution is nanoseconds, but its accuracy and precision are in question. That is not the answer to your problem, but it tells you how reliable the timestamping is before proceeding. See Difference between CLOCK_REALTIME and CLOCK_MONOTONIC? for why CLOCK_MONOTONIC should be used for your application.
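
A minimal sketch of the measurement itself on CLOCK_MONOTONIC, assuming sockfd and buf are set up as in your code:

#include <stddef.h>
#include <sys/socket.h>
#include <time.h>

/* Returns the time spent blocked in recvfrom(), in microseconds. */
static long timed_recv_us(int sockfd, void *buf, size_t len)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    (void)recvfrom(sockfd, buf, len, 0, NULL, NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) * 1000000L
         + (t1.tv_nsec - t0.tv_nsec) / 1000L;
}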
I suspect the majority of the decode time fluctuation is either due to a variable number of operations per decode, context switching of the operating system, or IRQs.
Operations per decode I cannot comment on since the code has been simplified in your post. This issue can also be profiled and inspected.
Context switching per process can be easily inspected and monitored https://unix.stackexchange.com/a/84345
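
If you want to watch this from inside the process itself rather than with external tools, getrusage() exposes voluntary/involuntary context-switch counters; a quick sanity-check sketch:

#include <stdio.h>
#include <sys/resource.h>

/* Print how often the process has been switched out; a climbing involuntary
   count during the decode loop points at scheduler/IRQ interference. */
void report_ctx_switches(void)
{
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    printf("voluntary: %ld  involuntary: %ld\n", ru.ru_nvcsw, ru.ru_nivcsw);
}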
As Ron stated, these are very strict timing requirements for a network. It must be an isolated network, and single purpose. Your observation regarding decode over-time when ssh'ing indicates all other traffic must be prevented. This is disturbing, given separate NICs. Thus I suspect IRQs are the issue. See /proc/interrupts.
Achieving consistent decode times over long intervals (hours to days) will require drastically simplifying the OS: removing unnecessary processes, services, and hardware, and perhaps building your own kernel, all with the goal of reducing context switching and interrupts. At that point a real-time OS should be considered. This will only improve the probability of consistent decode times, not guarantee them.
My work is developing a data acquisition system that is a combination of FPGA ADC, PC, and ethernet. Inevitably, the inconsistency of a multi-purpose PC means certain features must be moved to dedicated hardware. Consider the Pros/Cons of developing your application for PC versus moving it to hardware.

How do I turn on nanosecond precision when capturing live traffic?

How do I tell libpcap v1.6.2 to store nanosecond values in struct pcap_pkthdr::ts.tv_usec (instead of microsecond values) when capturing live packets?
(Note: This question is similar to How to enable nanosecond resolution when capturing live packets in libpcap? but that question is vague enough that I decided to ask a new question.)
For offline and "dead" captures, the following functions can be used to tell libpcap to fill the struct pcap_pkthdr's ts.tv_usec member with nanosecond values:
pcap_open_dead_with_tstamp_precision()
pcap_open_offline_with_tstamp_precision()
pcap_fopen_offline_with_tstamp_precision()
Unfortunately, there do not appear to be _with_tstamp_precision variants of pcap_open_live() or pcap_create().
I believe that capturing live packets with nanosecond resolution should be possible, because the changelog for v1.5.0 says (emphasis mine):
Add support for getting nanosecond-resolution time stamps when capturing and reading capture files
I did see the pcap_set_tstamp_type() function and the pcap-tstamp man page, which says:
PCAP_TSTAMP_HOST—host: Time stamp provided by the host on which the capture is being done. The precision of this time stamp is unspecified; it might or might not be synchronized with the host operating system's clock.
PCAP_TSTAMP_HOST_LOWPREC—host_lowprec: Time stamp provided by the host on which the capture is being done. This is a low-precision time stamp, synchronized with the host operating system's clock.
PCAP_TSTAMP_HOST_HIPREC—host_hiprec: Time stamp provided by the host on which the capture is being done. This is a high-precision time stamp; it might or might not be synchronized with the host operating system's clock. It might be more expensive to fetch than PCAP_TSTAMP_HOST_LOWPREC.
PCAP_TSTAMP_ADAPTER—adapter: Time stamp provided by the network adapter on which the capture is being done. This is a high-precision time stamp, synchronized with the host operating system's clock.
PCAP_TSTAMP_ADAPTER_UNSYNCED—adapter_unsynced: Time stamp provided by the network adapter on which the capture is being done. This is a high-precision time stamp; it is not synchronized with the host operating system's clock.
Does the phrase "high-precision time stamp" here mean that nanosecond values are stored in the header's ts.tv_usec field? If so, PCAP_TSTAMP_HOST says "unspecified", so how do I determine at runtime whether the ts.tv_usec field holds microseconds or nanoseconds? And which of these is the default if pcap_set_tstamp_type() is never called?
pcap_create() does little if anything to set parameters for the capture device, and has no alternative calls for setting those parameters; this is by design. The intent, at the time pcap_create() and pcap_activate() were introduced, was that neither of those calls would have to be changed in order to support new parameters, and that new APIs would be introduced as new parameters are introduced.
You're supposed to call pcap_create() to create a not-yet-activated handle, set the parameters with the appropriate calls, and then attempt to activate the handle with pcap_activate().
One of the appropriate calls is pcap_set_tstamp_precision(), which is the call you use between pcap_create() and pcap_activate() to specify that you want nanosecond-precision time stamps. The default is microsecond-precision time stamps, for backwards source and binary compatibility.
Note that pcap_set_tstamp_precision() will fail if you can't get nanosecond-precision time stamps from the device on which you're capturing, so you must check whether it succeeds or fails or call pcap_get_tstamp_precision() after activating the pcap_t in order to see what time stamp precision you'll be getting.
And, no, "high-precision" has nothing to do with whether you get microseconds or nanoseconds; it has to do with whether the nominal microsecond or nanosecond values really provide microsecond or nanosecond granularity, or whether you will always get values that are multiples of a power of 10 because the clock being used doesn't measure down to the microsecond or nanosecond.
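
A minimal sketch of the create/set/activate sequence described above (the device name and capture parameters are placeholders):

#include <pcap/pcap.h>
#include <stdio.h>

int main(void)
{
    char errbuf[PCAP_ERRBUF_SIZE];
    pcap_t *p = pcap_create("eth0", errbuf);
    if (p == NULL) {
        fprintf(stderr, "pcap_create: %s\n", errbuf);
        return 1;
    }
    /* Ask for nanosecond time stamps; this fails if the device can't do it. */
    if (pcap_set_tstamp_precision(p, PCAP_TSTAMP_PRECISION_NANO) != 0)
        fprintf(stderr, "nanosecond time stamps not supported on this device\n");
    pcap_set_snaplen(p, 65535);
    pcap_set_promisc(p, 1);
    pcap_set_timeout(p, 1000);
    if (pcap_activate(p) < 0) {
        fprintf(stderr, "pcap_activate: %s\n", pcap_geterr(p));
        return 1;
    }
    /* Confirm what we actually got after activation. */
    if (pcap_get_tstamp_precision(p) == PCAP_TSTAMP_PRECISION_NANO)
        printf("ts.tv_usec now holds nanoseconds\n");
    pcap_close(p);
    return 0;
}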
