[ntpwg] NTP vs. PTP (was: Documents, slides, etc. from WG meeting)

Martin Burnicki martin.burnicki at gmx.de
Wed Oct 24 13:45:37 UTC 2007


Hi Brad,

Brad Knowles wrote:
>  From what little I can tell, it seems to me that IEEE 1588 is much
> more oriented towards a closed single-entity local area (or
> near-local area) network environment, where you don't care if you're
> sync'ed to any given external reference, or how close or far you may
> be from The One True Time, you just care that all your internal
> slaves are sync'ed to your given master, and that you have some
> pretty tight tolerances on how far out of sync the slaves can get.

Yep. However, it depends. Basically IEEE 1588 (PTP) is capable of sync'ing all 
clients to a server/grandmaster with high accuracy. If the application relies 
not only on the _same_ time for all nodes but also on the _right_ time, 
then you just have to take care that the grandmaster is sync'ed e.g. to a GPS 
receiver.

BTW, the same is essentially true for NTP.

> In contrast, it seems to me that NTP works well on the broader WAN
> multi-entity internet (global and beyond), and that it can work just
> fine on smaller scale single-entity networks where sub-nanosecond
> accuracy timekeeping is not required.
>
>
> The two protocols seem to complement each other pretty well, and it
> would appear that the folks at Meinberg agree.

I absolutely agree. 

The main difference between NTP and PTP is that the statistics implemented by 
the NTP algorithms yield quite good results even over WAN connections, and 
just using standard hardware.

PTP can achieve much higher accuracy than NTP, but _only_ if special 
hardware is used which explicitly supports PTP. And this basically applies 
to both PTP v1 and v2.

Imagine you want to send a packet with "the right time" from one PC to
another.

1.) The sending program picks up a time stamp and puts it into
    a network packet.

2.) The network packet is then passed to the IP protocol stack where
    it is passed down from the sending user space application to the
    network drivers which partially run in kernel space, where it
    finally ends up in a send queue. This introduces a delay which
    depends on the CPU power and the system load (interrupt requests).

3.) The network driver waits until the network wire is unused and
    starts to transmit the packet. If there's a collision on the wire
    then transmission is aborted and retried after a random delay.
    This also causes a random delay.

4.) Once the packet is on the wire the propagation delay is pretty
    constant, depending on the length of the wire. If there's a
    network hub between the sender and the receiver then this also
    introduces an additional delay, which is pretty constant, though.
    If there's a router or switch between the 2 nodes then the packet
    may be queued for an undetermined amount of time, which also
    results in an unknown delay.

5.) If the packet arrives at its destination then the network driver
    generates an interrupt request to let the packet be fetched by
    the protocol drivers. It also takes an unknown amount of time
    until this is done, depending on the CPU power, whether there
    are higher-prioritized interrupts just being handled, etc.

6.) Finally the packet is moved up the protocol stack, moved from
    kernel space back to user space, and passed to the application
    which then takes a time stamp of its own system time in order
    to compute the difference to the time stamp from the incoming
    packet.

There's a bunch of unknown delays there, isn't there?

The only delay which is constant and can be measured is the propagation delay 
on the cable.

The other effects introduce a more or less random delay which can only be 
estimated by statistical methods. That's what the NTP algorithms do with 
quite good results. The advantage is that no special hardware is required, 
but the disadvantage is that you can only achieve limited accuracy.
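
To illustrate the statistical approach, here is a much simplified sketch in
Python. It uses the standard NTP offset and round-trip delay formulas for one
request/reply exchange and then, out of several exchanges, prefers the one
with the smallest round-trip delay, since that one was probably disturbed
least by queueing. The names Sample, ntp_sample and best_sample are made up
for this example; the real NTP algorithms do considerably more filtering.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    offset: float  # estimated (server time - client time), in seconds
    delay: float   # measured round-trip delay, in seconds


def ntp_sample(t1, t2, t3, t4):
    """Compute offset and delay from one request/reply exchange.

    t1: client sends request   (client clock)
    t2: server receives it     (server clock)
    t3: server sends reply     (server clock)
    t4: client receives reply  (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return Sample(offset, delay)


def best_sample(samples):
    """Pick the exchange with the smallest round-trip delay."""
    return min(samples, key=lambda s: s.delay)


# Assumed scenario: the client clock is 10 ms behind the server, and the
# undisturbed one-way delay is 2 ms in each direction.
samples = [
    ntp_sample(100.000, 100.012, 100.013, 100.005),  # undisturbed exchange
    ntp_sample(100.000, 100.018, 100.019, 100.011),  # request queued for 6 ms
]
best = best_sample(samples)
print(f"offset {best.offset:.6f} s, delay {best.delay:.6f} s")
```

In this made-up scenario the queued exchange alone would give an offset of
about 13 ms instead of 10 ms, i.e. half of the extra one-way delay shows up as
an error. Preferring the low-delay exchange avoids most of that.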

In order to compensate for the receive delay (i.e. from when a packet comes in 
from the wire until it arrives at the application) you have to take a time stamp when 
a packet comes in from the wire. This is done by a time stamp unit which 
includes a pattern matcher which has to identify incoming packets in the bit 
stream from the wire, and take a time stamp if such a packet is detected. 
Both the network card driver and the application have to provide a way (API 
call) which lets the application retrieve that time stamp from the NIC driver 
and assign it to the associated (right) packet. This way the application can 
compute the difference between the time it has received the packet and the 
time the packet has arrived from the wire, and account for that delay.

Obviously the same has to be done for outgoing packets, i.e. determine the 
time interval from when the packet is sent by the application until it really 
goes onto the wire. The calculated delay has to be passed to the receiver 
which has to account for that delay. Unfortunately the time stamp can only be 
taken when the packet goes out onto the wire, so when the time stamp is 
available the packet has already been sent. The PTP protocol accounts for 
this situation by sending a so-called follow-up packet which contains the 
time stamp of the previous packet. The receiver then gets the original packet 
which is time-stamped when coming in, plus the follow-up packet which 
contains the transmission delay and can thus account for both the delays. 
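
As a rough sketch of the arithmetic on the receiving side: IEEE 1588 combines
the Sync/Follow_Up pair described above with a Delay_Req/Delay_Resp exchange
in the opposite direction, which is not described in the text above but is
needed to separate the clock offset from the path delay. The function name
and the numbers below are invented for illustration, and the calculation
assumes that the path delay is the same in both directions.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Compute slave offset and mean path delay from four time stamps.

    t1: Sync leaves the master      (master clock, delivered in Follow_Up)
    t2: Sync arrives at the slave   (slave clock, hardware time stamp)
    t3: Delay_Req leaves the slave  (slave clock, hardware time stamp)
    t4: Delay_Req arrives at master (master clock, delivered in Delay_Resp)
    """
    master_to_slave = t2 - t1   # path delay + offset
    slave_to_master = t4 - t3   # path delay - offset
    mean_path_delay = (master_to_slave + slave_to_master) / 2
    offset_from_master = master_to_slave - mean_path_delay
    return offset_from_master, mean_path_delay


# Assumed scenario, all values in nanoseconds: the slave clock is 1500 ns
# ahead of the master, and the cable delay is 400 ns in each direction.
offset, delay = ptp_offset_and_delay(
    t1=1_000_000, t2=1_001_900, t3=1_050_000, t4=1_048_900
)
print(offset, delay)  # 1500.0 400.0
```

The slave would then steer its clock by the computed offset. Since all four
time stamps are taken at the wire, the delays inside the protocol stacks of
both nodes drop out of the calculation.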

Using a point-to-point connection between the transmitting and the receiving 
node you can achieve an accuracy down to a couple of nanoseconds by hardware 
timestamping.

However, there's still a last delay which is not yet known by our server and 
client. This is the propagation delay across intermediate nodes like switches 
and routers. For example, if a switch receives an incoming packet at one 
port, and the outgoing port is just being used by another packet then the 
incoming packet is queued internally in a FIFO. This can take up to several 
tens of milliseconds (!), depending on the network load and the queue depth.

The problem is that neither the transmitting nor the receiving node can 
determine whether a packet has been passed on directly, or has been queued, 
and for how long it has been queued.

So even if both your endpoints provide a way for hardware time stamping, a 
single standard switch between them can screw up the accuracy. The only ways 
to avoid this are either to use "dumb" hubs which just duplicate the packets 
without queueing them, or to use special switches which are aware of special 
timing packets and handle them in a special way.

The PTP protocol defines special "transparent clocks" and "boundary clocks" 
which can be implemented in switches or routers in order to handle the PTP 
packets in a special way which compensates for the switch's delays. The Hirschmann switch 
which we offer as part of our PTP starter kit
http://www.meinberg.de/english/ptp-starterkit
supports the PTP protocol that way.
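
The effect of a queueing switch, and the idea behind compensating for it, can
be sketched with a few lines of Python. This follows the transparent clock
approach, where the switch measures how long a timing packet has stayed
inside it (the residence time) and reports that value so the receiver can
subtract it. A boundary clock works differently and is not covered here. The
function and parameter names are invented for this example.

```python
def offset_from_master(t1, t2, t3, t4,
                       sync_residence=0, delay_req_residence=0):
    """Slave offset from four wire time stamps, in nanoseconds.

    t1/t2: Sync leaves the master / arrives at the slave
    t3/t4: Delay_Req leaves the slave / arrives at the master
    The residence values are the times the packets spent inside a switch,
    as measured and reported by the switch itself.
    """
    master_to_slave = (t2 - t1) - sync_residence
    slave_to_master = (t4 - t3) - delay_req_residence
    return (master_to_slave - slave_to_master) / 2


# Assumed scenario: both clocks agree perfectly (true offset 0), the cable
# delay is 400 ns per direction, and the switch queues the Sync packet for
# 20000 ns while the Delay_Req passes through without queueing.
stamps = dict(t1=0, t2=20_400, t3=100_000, t4=100_400)

uncorrected = offset_from_master(**stamps)
corrected = offset_from_master(**stamps, sync_residence=20_000)
print(uncorrected, corrected)  # 10000.0 0.0
```

Without the correction half of the queueing time appears as a bogus clock
offset, even though both endpoints took perfect hardware time stamps.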

Since standard PC NICs don't provide hardware TSU support nowadays, you must 
install a special PTP NIC (normally a PCI card) in each of the PCs anyway.

Here is a short summary on the 3 ways to assign timestamps to specific 
outgoing and incoming network packets:

1.) Inside the application, if a packet is sent or received. This is very 
portable since it runs completely in user space, and this is the way 
currently used by NTP. You can also run this on operating systems where you 
don't have access to the source code of the NIC drivers. However, this is the 
most inaccurate way since the time stamps depend on the latency of the IP 
stack, and on network collisions.

2.) Inside the NIC driver, e.g. in the interrupt service routine. This avoids 
the latencies of the IP stack, but you must modify the source code of the 
driver, or have a driver which takes time stamps if certain packets are being 
sent or received. Also, you don't know the latency due to network collisions 
when sending packets.

3.) By the NIC hardware. You need a time stamp unit (TSU) which listens on the 
data lines between the MAC and the PHY. The TSU contains a pattern matching 
unit which detects the pattern of the desired packet type in the serial bit 
stream on those data lines. If such a pattern is detected then the pattern 
matching unit must capture a time stamp of a high speed counter which is also 
part of the TSU. Since this method takes time stamps when a packet goes on 
the wire or arrives from the wire, it is also capable of eliminating the 
latency due to network collisions. 
The driver must be able to read those time stamps from the TSU and either 
assign them to the packets or pass them up the protocol stack to the 
application which then assigns the time stamps to the packets. The TSU can be 
implemented in a programmable logic device like an ASIC or FPGA.

The latter method is the most accurate, but it requires a NIC with a TSU, and 
a kernel driver which supports the TSU. The problem here is that most 
commonly used NICs have the MAC and the PHY in a single chip, and the data 
lines on which the TSU must listen are not accessible outside the chip. 
However, there are also some new NIC chips available which have a built-in 
TSU.
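
To give an idea of what the pattern matching unit of a TSU has to do, here is
a software sketch in Python which inspects a complete Ethernet frame. A real
TSU does this in hardware on the bit stream between MAC and PHY. The sketch
assumes PTP over UDP/IPv4, where event messages such as Sync are sent to UDP
port 319, and it ignores VLAN tags, IP fragments and other transports. All
function names are invented for this example.

```python
import struct

ETH_HEADER_LEN = 14
ETHERTYPE_IPV4 = 0x0800
IP_PROTO_UDP = 17
PTP_EVENT_PORT = 319  # UDP port used for PTP event messages


def is_ptp_event_frame(frame: bytes) -> bool:
    """Return True if the frame carries a UDP/IPv4 packet to port 319."""
    if len(frame) < ETH_HEADER_LEN + 20 + 8:
        return False
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        return False
    version_ihl = frame[ETH_HEADER_LEN]
    ip_header_len = (version_ihl & 0x0F) * 4
    if version_ihl >> 4 != 4 or ip_header_len < 20:
        return False
    if frame[ETH_HEADER_LEN + 9] != IP_PROTO_UDP:
        return False
    udp_start = ETH_HEADER_LEN + ip_header_len
    if len(frame) < udp_start + 8:
        return False
    (dst_port,) = struct.unpack_from("!H", frame, udp_start + 2)
    return dst_port == PTP_EVENT_PORT


def capture(frame: bytes, counter_value: int, latched: list) -> None:
    """Latch the counter value if the frame matches, like a TSU would."""
    if is_ptp_event_frame(frame):
        latched.append(counter_value)


def make_udp_frame(dst_port: int) -> bytes:
    """Build a minimal dummy frame for the demonstration below."""
    eth = bytes(12) + struct.pack("!H", ETHERTYPE_IPV4)
    ip = bytes([0x45]) + bytes(8) + bytes([IP_PROTO_UDP]) + bytes(10)
    udp = struct.pack("!HHHH", 12345, dst_port, 8, 0)
    return eth + ip + udp


latched = []
capture(make_udp_frame(319), counter_value=1000, latched=latched)
capture(make_udp_frame(123), counter_value=2000, latched=latched)  # NTP port
print(latched)  # [1000]
```

Only the frame addressed to the PTP event port causes a counter value to be
latched. The driver would then have to read the latched values and assign
them to the right packets, as described above.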

In conclusion you can say that:

1.) NTP can achieve pretty good accuracy in both small and large networks 
where you don't know which routes the packets take, and you can not and don't 
have to rely on special hardware support for the protocol.

2.) PTP can yield very high accuracy provided that the network infrastructure 
fully supports the protocol. Obviously this is easier to implement in a 
closed network where the administrators have full control over the equipment.

3.) If the special hardware support for PTP is not available, PTP suffers 
from the same limitations as NTP, i.e. the unknown latencies mentioned above. 
Under these conditions NTP can even yield better results due to the 
statistical methods it uses. 

The other way round, we have made tests with NTP using the same hardware 
timestamping methods, and we could see that NTP can yield the same accuracy 
as PTP if the basic conditions are similar.

The problem here is that the current specification of the NTP protocol does 
not provide a method to send a follow-up message to the client in order to 
let the client know when the original packet really made its way onto the 
wire. If you add such a method then it breaks compatibility with existing 
implementations of NTP.

It's an advantage for the PTP protocol that it could be implemented in a way 
which adds that option missing for NTP.


Best regards,

Martin
-- 
Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont
Germany

