[ntpwg] NTP vs. PTP (was: Documents, slides, etc. from WG meeting)
Martin Burnicki
martin.burnicki at gmx.de
Wed Oct 24 13:45:37 UTC 2007
Hi Brad,
Brad Knowles wrote:
> From what little I can tell, it seems to me that IEEE 1588 is much
> more oriented towards a closed single-entity local area (or
> near-local area) network environment, where you don't care if you're
> sync'ed to any given external reference, or how close or far you may
> be from The One True Time, you just care that all your internal
> slaves are sync'ed to your given master, and that you have some
> pretty tight tolerances on how far out of sync the slaves can get.
Yep. However, it depends. Basically IEEE 1588 (PTP) is capable of sync'ing all
clients to a server/grandmaster with high accuracy. If the application relies
not only on the _same_ time for all nodes but also on the _right_ time,
then you just have to take care that the grandmaster is sync'ed to an
external reference, e.g. a GPS receiver.
BTW, the same is essentially true for NTP.
> In contrast, it seems to me that NTP works well on the broader WAN
> multi-entity internet (global and beyond), and that it can work just
> fine on smaller scale single-entity networks where sub-nanosecond
> accuracy timekeeping is not required.
>
>
> The two protocols seem to complement each other pretty well, and it
> would appear that the folks at Meinberg agree.
I absolutely agree.
The main difference between NTP and PTP is that the statistics implemented by
the NTP algorithms yield quite good results even over WAN connections, and
just using standard hardware.
PTP can achieve much higher accuracy than NTP, but _only_ if special
hardware is used which explicitly supports PTP. And this basically applies
to both PTP v1 and v2.
Imagine you want to send a packet with "the right time" from one PC to
another.
1.) The sending program picks up a time stamp and puts it into
a network packet.
2.) The network packet is then passed to the IP protocol stack where
it is passed down from the sending user space application to the
network drivers which partially run in kernel space, where it
finally ends up in a send queue. This introduces a delay which
depends on the CPU power and the system load (interrupt requests).
3.) The network driver waits until the network wire is unused and
starts to transmit the packet. If there's a collision on the wire
then transmission is aborted and retried after a random delay.
This also causes a random delay.
4.) Once the packet is on the wire the propagation delay is pretty
constant, depending on the length of the wire. If there's a
network hub between the sender and the receiver then this also
introduces an additional delay, which is pretty constant, though.
If there's a router or switch between the 2 nodes then the packet
may be queued for an undetermined amount of time, which also
results in an unknown delay.
5.) If the packet arrives at its destination then the network driver
generates an interrupt request to let the packet be fetched by
the protocol drivers. It also takes an unknown amount of time
until this is done, depending on the CPU power, whether there
are higher-prioritized interrupts just being handled, etc.
6.) Finally the packet is moved up the protocol stack, moved from
kernel space back to user space, and passed to the application
which then takes a time stamp of its own system time in order
to compute the difference to the time stamp from the incoming
packet.
There's a bunch of unknown delays there, isn't there?
The only delay which is constant and can be measured is the propagation delay
on the cable.
The other effects introduce a more or less random delay which can only be
estimated by statistical methods. That's what the NTP algorithms do with
quite good results. The advantage is that no special hardware is required,
but the disadvantage is that you can only achieve a limited accuracy.
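
To illustrate the statistical approach, here is a minimal Python sketch. It
uses the standard NTP offset/delay calculation from four time stamps and then
simply prefers the sample with the smallest round-trip delay, which is roughly
what the NTP clock filter does. The simulated delay values, the function names
and the fixed true offset are assumptions made up for this example, they are
not taken from any real implementation.

```python
import random

def ntp_offset_delay(t1, t2, t3, t4):
    """t1: client send, t2: server receive, t3: server send, t4: client receive.
    t1/t4 are read from the client clock, t2/t3 from the server clock."""
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay

def simulate_exchange(true_offset, rng, base=0.0005, jitter=0.005):
    """One request/reply with random one-way delays in seconds (assumed values).
    true_offset is server time minus client time."""
    out = base + rng.uniform(0, jitter)    # client -> server latency
    back = base + rng.uniform(0, jitter)   # server -> client latency
    t1 = 100.0                             # client clock
    t2 = t1 + true_offset + out            # server clock
    t3 = t2 + 0.0001                       # server processing time
    t4 = t3 - true_offset + back           # client clock
    return ntp_offset_delay(t1, t2, t3, t4)

def best_sample(samples):
    """Prefer the sample with the lowest round-trip delay."""
    return min(samples, key=lambda s: s[1])

rng = random.Random(1)
samples = [simulate_exchange(0.250, rng) for _ in range(8)]
offset, delay = best_sample(samples)
print("estimated offset %.6f s, round-trip delay %.6f s" % (offset, delay))
```

The error of each offset estimate is half the difference between the two
one-way delays, so it can never exceed half the round-trip delay. That is why
a low-delay sample is the most trustworthy one.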
In order to compensate for the receive delay (i.e. the time from when a
packet comes in from the wire until it arrives at the application) you have
to take a time stamp at the moment the packet comes in from the wire. This
is done by a time stamp unit which
includes a pattern matcher which has to identify incoming packets in the bit
stream from the wire, and take a time stamp if such a packet is detected.
Both the network card driver and the application have to provide a way (API
call) which lets the application retrieve that time stamp from the NIC driver
and assign it to the associated (right) packet. This way the application can
compute the difference between the time it has received the packet and the
time the packet has arrived from the wire, and account for that delay.
Obviously the same has to be done for outgoing packets, i.e. determine the
time interval from when the packet is sent by the application until it really
goes onto the wire. The calculated delay has to be passed to the receiver
which has to account for that delay. Unfortunately the time stamp can only be
taken when the packet goes out onto the wire, so when the time stamp is
available the packet has already been sent. The PTP protocol accounts for
this situation by sending a so-called follow-up packet which contains the
time stamp of the previous packet. The receiver then gets the original packet
which is time-stamped when coming in, plus the follow-up packet which
carries the precise transmit time stamp, and can thus account for both delays.
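
The arithmetic behind this can be sketched in a few lines of Python. The
sketch assumes the usual PTP delay request/response exchange in addition to
the Sync/Follow_Up pair described above. The function name and all numbers
(a slave clock 1500 ns ahead, 50 ns cable delay, 30 us send latency in the
IP stack) are assumptions chosen for illustration only.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """All values in nanoseconds.
    t1: master time when Sync went onto the wire (delivered in the follow-up)
    t2: slave time when Sync came in from the wire
    t3: slave time when Delay_Req went onto the wire
    t4: master time when Delay_Req came in (returned in Delay_Resp)"""
    mean_path_delay = ((t2 - t1) + (t4 - t3)) / 2
    offset_from_master = (t2 - t1) - mean_path_delay
    return offset_from_master, mean_path_delay

# Hardware time stamps: slave is 1500 ns ahead, cable delay is 50 ns.
t1 = 1_000_000              # master clock, taken on the wire
t2 = t1 + 50 + 1500         # slave clock
t3 = 1_002_000              # slave clock
t4 = t3 - 1500 + 50         # master clock
print(ptp_offset_and_delay(t1, t2, t3, t4))           # offset 1500 ns, delay 50 ns

# Software time stamp: taken 30 us before the packet really hit the wire.
t1_software = t1 - 30_000
print(ptp_offset_and_delay(t1_software, t2, t3, t4))  # offset is off by 15000 ns
```

In the second call the unknown send latency shows up as one-way asymmetry, so
half of it ends up as an error in the computed offset.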
Using a point-to-point connection between the transmitting and the receiving
node you can achieve an accuracy down to a couple of nanoseconds by hardware
timestamping.
However, there's still a last delay which is not yet known by our server and
client. This is the propagation delay across intermediate nodes like switches
and routers. For example, if a switch receives an incoming packet at one
port, and the outgoing port is just being used by another packet then the
incoming packet is queued internally in a FIFO. This can take up to several
tens of milliseconds (!), depending on the network load and the queue depth.
The problem is that neither the transmitting nor the receiving node can
determine whether a packet has been passed on directly, or has been queued,
and for how long it has been queued.
So even if both your endpoints provide a way for hardware time stamping, a
single standard switch between them can screw up the accuracy. The only ways
to avoid this are either to use "dumb" hubs which just duplicate the packets
without queueing them, or to use special switches which are aware of special
timing packets and handle them in a special way.
The PTP protocol defines special "transparent clocks" and "boundary clocks"
which can be implemented in switches or routers in order to handle the PTP
packets in a special way which compensates for the switch's delays. The
Hirschmann switch
which we offer as part of our PTP starter kit
http://www.meinberg.de/english/ptp-starterkit
supports the PTP protocol that way.
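
As a rough illustration of the transparent clock idea, the following Python
sketch models a switch which measures how long a timing packet stays inside
it and adds that residence time to a correction value carried along with the
packet. The class and function names are hypothetical, and the numbers
(3 ms queueing, 100 ns total cable delay, slave 200 ns ahead) are made up.
Real devices also have to measure the link delay, which is simply assumed
to be known here.

```python
from dataclasses import dataclass

@dataclass
class SyncMessage:
    origin_timestamp_ns: int   # master time when the packet went onto the wire
    correction_ns: int = 0     # residence time accumulated in switches

class TransparentSwitch:
    """Hypothetical switch with hardware time stamping on every port."""
    def forward(self, msg, ingress_ns, egress_ns):
        # Only the difference of the switch's local counter matters.
        msg.correction_ns += egress_ns - ingress_ns
        return msg

def slave_offset(msg, rx_timestamp_ns, link_delay_ns):
    """Offset of the slave clock relative to the master, in nanoseconds."""
    return (rx_timestamp_ns - msg.origin_timestamp_ns
            - msg.correction_ns - link_delay_ns)

# The packet is queued for 3 ms inside the switch; the slave is 200 ns ahead.
rx_time = 1_000_000 + 100 + 3_000_000 + 200

plain = SyncMessage(origin_timestamp_ns=1_000_000)
print(slave_offset(plain, rx_time, 100))       # standard switch: 3000200

corrected = TransparentSwitch().forward(
    SyncMessage(origin_timestamp_ns=1_000_000),
    ingress_ns=5_000, egress_ns=3_005_000)
print(slave_offset(corrected, rx_time, 100))   # transparent clock: 200
```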
Since standard PC NICs don't provide hardware TSU support nowadays, you must
install a special PTP NIC (normally a PCI card) in each of the PCs anyway.
Here is a short summary on the 3 ways to assign timestamps to specific
outgoing and incoming network packets:
1.) Inside the application, if a packet is sent or received. This is very
portable since it runs completely in user space, and this is the way
currently used by NTP. You can also run this on operating systems where you
don't have access to the source code of the NIC drivers. However, this is the
most inaccurate way since the time stamps depend on the latency of the IP
stack, and on network collisions.
2.) Inside the NIC driver, e.g. in the interrupt service routine. This avoids
the latencies of the IP stack, but you must modify the source code of the
driver, or have a driver which takes time stamps if certain packets are being
sent or received. Also, you don't know the latency due to network collisions
when sending packets.
3.) By the NIC hardware. You need a time stamp unit (TSU) which listens on the
data lines between the MAC and the PHY. The TSU contains a pattern matching
unit which detects the pattern of the desired packet type in the serial bit
stream on those data lines. If such a pattern is detected then the pattern
matching unit must capture a time stamp of a high speed counter which is also
part of the TSU. Since this method takes time stamps when a packet goes on
the wire or arrives from the wire, it is also capable of eliminating the
latency due to network collisions.
The driver must be able to read those time stamps from the TSU and either
assign them to the packets or pass them up the protocol stack to the
application which then assigns the time stamps to the packets. The TSU can be
implemented in a programmable logic device like an ASIC or FPGA.
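
What the pattern matching unit has to do can be modelled in software. The
Python sketch below looks for PTP event messages sent via UDP over IPv4
(destination port 319) in a raw Ethernet frame and latches a free-running
counter if one is seen. This is only a simplified model: a real TSU works on
the bit stream in hardware, and the sketch ignores VLAN tags, IPv6 and
fragmented packets. The TimeStampUnit class and the build_udp_frame() helper
are hypothetical names introduced just for this example.

```python
import struct

PTP_EVENT_PORT = 319      # UDP port used for PTP event messages
ETHERTYPE_IPV4 = 0x0800
IPPROTO_UDP = 17
ETH_HEADER_LEN = 14

def is_ptp_event_frame(frame: bytes) -> bool:
    """Return True if the frame is IPv4/UDP with the PTP event port as target."""
    if len(frame) < ETH_HEADER_LEN + 20 + 8:
        return False
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        return False
    ip_header_len = (frame[ETH_HEADER_LEN] & 0x0F) * 4
    if ip_header_len < 20 or frame[ETH_HEADER_LEN + 9] != IPPROTO_UDP:
        return False
    udp_start = ETH_HEADER_LEN + ip_header_len
    if len(frame) < udp_start + 8:
        return False
    (dst_port,) = struct.unpack_from("!H", frame, udp_start + 2)
    return dst_port == PTP_EVENT_PORT

class TimeStampUnit:
    """Software model: a counter plus a latch triggered by the matcher."""
    def __init__(self):
        self.counter = 0
        self.captured = []

    def tick(self, n=1):
        self.counter += n

    def observe(self, frame: bytes):
        if is_ptp_event_frame(frame):
            self.captured.append(self.counter)

def build_udp_frame(dst_port: int, payload: bytes = b"") -> bytes:
    """Build a minimal Ethernet/IPv4/UDP frame (addresses and checksums zero)."""
    eth = bytes(12) + struct.pack("!H", ETHERTYPE_IPV4)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 28 + len(payload), 0, 0,
                     64, IPPROTO_UDP, 0, bytes(4), bytes(4))
    udp = struct.pack("!HHHH", 12345, dst_port, 8 + len(payload), 0)
    return eth + ip + udp + payload

tsu = TimeStampUnit()
tsu.tick(100)
tsu.observe(build_udp_frame(123))   # NTP packet: ignored by the matcher
tsu.tick(50)
tsu.observe(build_udp_frame(319))   # PTP event packet: counter is latched
print(tsu.captured)                 # [150]
```

The driver would then read the captured values from the TSU and assign them
to the corresponding packets.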
The last method is the most accurate, but it requires a NIC with a TSU, and
a kernel driver which supports the TSU. The problem here is that most
commonly used NICs have the MAC and the PHY in a single chip, and the data
lines on which the TSU must listen are not accessible outside the chip.
However, there are also some new NIC chips available which have a built-in
TSU.
In conclusion you can say that
1.) NTP can achieve pretty good accuracy in both small and large networks
where you don't know which routes the packets take, and you cannot and don't
have to rely on special hardware support for the protocol.
2.) PTP can yield very high accuracy provided that the network infrastructure
fully supports the protocol. Obviously this is easier to implement in a
closed network where the administrators have full control over the equipment.
3.) If the special hardware support for PTP is not available, PTP suffers
from the same limitations as NTP, i.e. the unknown latencies mentioned above.
Under these conditions NTP can even yield better results due to the
statistical methods it uses.
The other way round, we have made tests with NTP using the same hardware
timestamping methods, and we could see that NTP can yield the same accuracy
as PTP if the basic conditions are similar.
The problem here is that the current specification of the NTP protocol does
not provide a method to send a follow-up message to the client in order to
let the client know when the original packet really made its way onto the
wire. If you add such a method then it breaks compatibility with existing
implementations of NTP.
It's an advantage of the PTP protocol that it could be designed in a way
which includes the option that NTP is missing.
Best regards,
Martin
--
Martin Burnicki
Meinberg Funkuhren
Bad Pyrmont
Germany