[ntpwg] Testing NTP performance
David L. Mills
mills at udel.edu
Mon May 5 01:45:28 UTC 2008
Harlan,
It's hard with my limited eyesight to respond inline.
There are two themes running through your questions, whether the beast
is hungry and being fed on one hand and how rich are the calories on the
other. There has been a good deal of recent effort on the first theme in
the revamped monitoring statistics, in particular the new protostats.
These are the quickest way to assess whether the beast is working
correctly now and in the recent past. Take a look at these on pogo
and/or rackety. These record all significant events that might occur in
a specification-conformant implementation.
Counting calories is harder, especially with regard to your questions
about question answers. This is a statistically noisy place and full, it seems,
of urban legends. The short answers are in the specification. Read the
definitions carefully and most of what you need pops right out - peer
offset, delay, jitter and dispersion plus their root derivatives. Folks
might like to know about what nominal expectations are in the wild; well, a
few minutes with ntpq can be most entertaining. As for the rest, well,
there are many briefings on the project page and support site; there is a
faq and, well, even a book.
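As a rough illustration of how those definitions pop out, here is a minimal Python sketch of the on-wire offset and delay calculation from the four timestamps, plus the root synchronization distance. The function and variable names are made up for this example and are not taken from ntpd; the root distance is reduced to its two main terms (half the root delay plus the root dispersion), so treat it as an approximation of what the specification defines.

```python
def on_wire(t1, t2, t3, t4):
    """Return (offset, delay) in seconds from the four timestamps.

    t1: client transmit time (client clock)
    t2: server receive time (server clock)
    t3: server transmit time (server clock)
    t4: client receive time (client clock)
    """
    # Offset of the server clock relative to the client clock.
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    # Round-trip delay, excluding the time the server held the packet.
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def root_distance(root_delay, root_dispersion):
    """Approximate maximum error relative to the primary reference.

    Simplified to the two main terms; the specification adds further
    contributions that are omitted here.
    """
    return root_delay / 2.0 + root_dispersion


# Hypothetical exchange: the client clock runs 0.5 s behind the server.
offset, delay = on_wire(10.0, 10.625, 10.75, 10.375)
print(offset, delay)                # 0.5 0.25
print(root_distance(0.030, 0.015))  # roughly 0.030
```

The offset is the estimate; the root distance is the bound on how wrong that estimate could be, which is why the two should be read together.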
From my perspective as a teacher, my undergraduate students hate to go
to briefing documents, and to a textbook even less. I read textbooks for
fun (so thaaats how it works...). Everybody wants to google for
everything, even when a carefully prepared table of contents and list of
index terms are available. (I just got 212,000 hits on "autokey
protocol"; all I checked were for the real thing.)
Dave
Harlan Stenn wrote:
> Dave,
>
> Please help me out here.
>
>> In your question below the measured offset is the maximum likelihood
>> estimate of the client offset relative to the server, while the jitter
>> represents the estimated error of the offset estimate itself. The
>> synchronization distance represents the maximum error due to all causes
>> with the assumption that the maximum inherent clock frequency error is
>> bounded. The assumed parameters for phase and frequency error estimate
>> are represented by the limbs of the Allan deviation specific to each
>> installation. There's not much more that can be said about the
>> probabilistics.
>
>
> So if somebody wants to ask the question "Does ntpd think the
> time on the machine is OK and sync'd?" the first thing we all agree on
> is that's not the best question to ask. But we should be able to
> determine:
>
> Q: Does ntpd think it is sync'd? To something "useful"?
> A: Look at the (system) status word in the response to a mode-6 control
> message, and make sure:
> - LI is not 11
> - the value of "clock source" is something tolerable
>
> Q: What time does ntpd think it is?
> A: do "the dance of the 4 timestamps".
>
> Q: How close can we expect the previous answer to be to the "correct"
> time?
> A: Check out lots of error-related variables and pick the ones you think
> you want. This includes root dispersion, jitter, and synchronization
> distance. Remember that the answer is a probability. (OK, we also
> need to be able to report that probability.)
>
> Q: How is the previous answer affected by "slew-only" mode?
> A: (Harlan does not currently know the answer to this one.)
>
> Q: Are any of the above questions superfluous? If so, please explain.
>
> Q: Should we be asking any additional or different questions?
>
> This general topic is a FAQ, and there is a need to have it covered in
> sufficient detail. Dave, while it may not be productive for *you* to
> answer this (either once or rarely), I believe it is critical that we
> can point people at a place to get an understanding of the issues that
> is more than "read the spec and Dave's book and then study the problem".
>
> Of course, the next thing that will happen is that people will tell us:
>
> - We want to meet the following "performance" target at a given
> statistical level.
> - Based on the spec, we can see the target above is reachable.
> - We are monitoring the values (X), and we are seeing (Y) which seems to
> be out of spec.
> - - how do we figure out what the problem is?
> - - how do we fix it?
>
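The first check in the quoted list (LI is not 11, and the clock source is something tolerable) could be sketched roughly as below. This assumes the 16-bit system status word layout from the NTP control message description: 2 bits of leap indicator, 6 bits of clock source, 4 bits of event counter and 4 bits of event code, most significant bits first. The function names and the set of acceptable sources are hypothetical; which sources count as "tolerable" is a local policy decision.

```python
def decode_system_status(status):
    """Split a 16-bit system status word into its four fields."""
    if not 0 <= status <= 0xFFFF:
        raise ValueError("status word must fit in 16 bits")
    return {
        "leap": (status >> 14) & 0x3,        # leap indicator (LI)
        "source": (status >> 8) & 0x3F,      # clock source code
        "event_count": (status >> 4) & 0xF,  # events since last read
        "event_code": status & 0xF,          # most recent event
    }


def looks_synchronized(status, acceptable_sources):
    """True if LI is not 11 (binary) and the source is on the caller's list."""
    fields = decode_system_status(status)
    return fields["leap"] != 0b11 and fields["source"] in acceptable_sources


# Example policy (an assumption): accept only source code 6.
ACCEPTABLE = {6}

print(decode_system_status(0x0644))
# {'leap': 0, 'source': 6, 'event_count': 4, 'event_code': 4}
print(looks_synchronized(0x0644, ACCEPTABLE))  # True
print(looks_synchronized(0xC011, ACCEPTABLE))  # False
```

This only answers whether ntpd believes it is synchronized to an acceptable source; how close the time is expected to be still depends on the error statistics discussed above.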