[ntpwg] Testing NTP performance

David L. Mills mills at udel.edu
Mon May 5 01:45:28 UTC 2008


Harlan,

It's hard with my limited eyesight to respond inline.

There are two themes running through your questions, whether the beast 
is hungry and being fed on one hand and how rich are the calories on the 
other. There has been a good deal of recent effort on the first theme in
the revamped monitoring statistics, in particular the new protostats.
These are the quickest way to assess whether the beast is working
correctly now and in the recent past. Take a look at these on pogo
and/or rackety. These record all significant events that might occur in 
a specification-conformant implementation.

Counting calories is harder, especially when it comes to your questions
about the answers. This is a statistically noisy place and full, it seems,
of urban legends. The short answers are in the specification. Read the
definitions carefully and most of what you need pops right out - peer
offset, delay, jitter and dispersion plus their root derivatives. Folks
might like to know what nominal expectations are in the wild; well, a
few minutes with ntpq can be most entertaining. As for the rest, well,
there are many briefings on the project page and support site; there is a
FAQ and, well, even a book.
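
For readers who want the peer offset and delay definitions in concrete form, here is a minimal sketch. It assumes the standard on-wire formulas from the NTP specification; the function and variable names are illustrative and are not taken from ntpd.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Compute clock offset and round-trip delay from one exchange.

    Assumed timestamp meanings (all in seconds):
      t1: client transmit time, read from the client clock
      t2: server receive time, read from the server clock
      t3: server transmit time, read from the server clock
      t4: client receive time, read from the client clock
    """
    # Offset of the server clock relative to the client clock,
    # assuming the outbound and return path delays are equal.
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    # Round-trip delay with the server's processing time removed.
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


if __name__ == "__main__":
    # Hypothetical exchange: server clock 10 s ahead, 0.5 s each way,
    # 0.25 s spent inside the server.
    print(offset_and_delay(0.0, 10.5, 10.75, 1.25))
```

The offset is only as good as the symmetric-path assumption; any asymmetry between the two directions shows up directly as offset error, bounded by half the delay.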

From my perception as a teacher, my undergraduate students hate to go
to briefing documents and like going to a textbook even less. I read
textbooks for fun (so thaaat's how it works...). Everybody wants to google
for everything, even when a carefully prepared table of contents and list
of index terms are available. (I just got 212,000 hits on "autokey
protocol"; all I checked were for the real thing.)

Dave

Harlan Stenn wrote:

> Dave,
>
> Please help me out here.
>
>> In your question below the measured offset is the maximum likelihood
>> estimate of the client offset relative to the server, while the jitter
>> represents the estimated error of the offset estimate itself. The
>> synchronization distance represents the maximum error due to all causes
>> with the assumption that the maximum inherent clock frequency error is
>> bounded. The assumed parameters for phase and frequency error estimate
>> are represented by the limbs of the Allan deviation specific to each
>> installation. There's not much more that can be said about the
>> probabilistics.
>
>
> So if somebody wants to ask the question "Does ntpd think the
> time on the machine is OK and sync'd?" the first thing we all agree on
> is that's not the best question to ask. But we should be able to
> determine:
>
> Q: Does ntpd think it is sync'd? To something "useful"?
> A: Look at the (system) status word in the response to a mode-6 control
> message, and make sure:
> - LI is not 11
> - the value of "clock source" is something tolerable
>
> Q: What time does ntpd think it is?
> A: do "the dance of the 4 timestamps".
>
> Q: How close can we expect the previous answer to be to the "correct"
> time?
> A: Check out lots of error-related variables and pick the ones you think
> you want. This includes root dispersion, jitter, and synchronization
> distance. Remember that the answer is a probability. (OK, we also
> need to be able to report that probability.)
>
> Q: How is the previous answer affected by "slew-only" mode?
> A: (Harlan does not currently know the answer to this one.)
>
> Q: Are any of the above questions superfluous? If so, please explain.
>
> Q: Should we be asking any additional or different questions?
>
> This general topic is a FAQ, and there is a need to have it covered in
> sufficient detail. Dave, while it may not be productive for *you* to
> answer this (either once or rarely), I believe it is critical that we
> can point people at a place to get an understanding of the issues that
> is more than "read the spec and Dave's book and then study the problem".
>
> Of course, the next thing that will happen is that people will tell us:
>
> - We want to meet the following "performance" target at a given
> statistical level.
> - Based on the spec, we can see the target above is reachable.
> - We are monitoring the values (X), and we are seeing (Y) which seems to
> be out of spec.
> - - how do we figure out what the problem is?
> - - how do we fix it?
>
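
The first check in the quoted message (LI is not 11, and the clock source is something tolerable) can be sketched as below. The sketch assumes the 16-bit system status word layout described in the ntpd documentation: 2 bits of leap indicator, 6 bits of clock source, a 4-bit event count and a 4-bit event code, most significant bits first. The source-name table is written from memory of that documentation and should be checked against your ntpd version. Which sources count as "tolerable" is a local policy decision, so the default set here is purely hypothetical.

```python
# Clock source codes as recalled from the ntpd status word documentation;
# verify against your installation before relying on them.
CLOCK_SOURCES = {
    0: "sync_unspec", 1: "sync_pps", 2: "sync_lf_radio",
    3: "sync_hf_radio", 4: "sync_uhf_radio", 5: "sync_local",
    6: "sync_ntp", 7: "sync_other", 8: "sync_wristwatch",
    9: "sync_telephone",
}

# Hypothetical policy: accept PPS, radio and NTP sources only.
DEFAULT_TOLERABLE = frozenset({1, 2, 3, 4, 6})


def decode_system_status(status):
    """Split a 16-bit system status word into (li, source, count, code)."""
    li = (status >> 14) & 0x3       # leap indicator, 3 means unsynchronized
    source = (status >> 8) & 0x3F   # clock source code
    count = (status >> 4) & 0xF     # events since the last read
    code = status & 0xF             # most recent event code
    return li, source, count, code


def looks_synced(status, tolerable=DEFAULT_TOLERABLE):
    """True if LI is not binary 11 and the source is in the tolerable set."""
    li, source, _, _ = decode_system_status(status)
    return li != 3 and source in tolerable


if __name__ == "__main__":
    word = 0x0615  # example value of the kind ntpq reports as status=0615
    li, source, count, code = decode_system_status(word)
    print(li, CLOCK_SOURCES.get(source, "unknown"), count, code)
    print(looks_synced(word))
```

This answers only "does ntpd think it is sync'd"; it says nothing about how close the time is, which is what the error-related variables are for.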