On 2/20/14, grarpamp grarpamp@gmail.com wrote:
Since you say it repeats you oppurtunity to check the system clock first:
- configure tor to syslog
added
- send an ntpdate -q pool to syslog every 5min,
remove when solved.
Do you mean disable ntpd daemon, and run this instead? Sounds easy enough, I imagine: service ntp stop; while true; do ntpd -gqn -l /var/log/syslog; sleep 5m; done service ntp start
Something like -L seems to be needed but -L doesn't stop, the following - I don't know what these sorts of lines are (reading docs now, but it might take a while to figure it out) - ie I don't know why ntpd listens on LAN: 20 Feb 18:31:26 ntpd[22655]: Listen normally on 3 eth1 192.168.5.2 UDP 123
BTW, looks like ntpd has the same options as ntpdate, intended for the same functions (at least on Debian, says ntpdate is deprecated).
Now, here's the last short while of this ntpd script output: ntpd: time slew -0.015558s ntpd: time slew +0.062676s ntpd: time set +0.191404s ntpd: time set +0.187068s ntpd: time slew -0.029898s ntpd: time slew +0.027801s
So it seems that the slew is somehow not being set properly, or rather, now that ntpd is being run every 5 minutes, it gets to add about .2 of a second pretty regularly (I'll continue to watch of course), so something definitely seems amiss. I'm not loading the system default ntpd config file.
It looks like time could be running slow and being _not_ updated, except a few times a day, resulting in the 2-3minute jump.
- send *.* to /var/log/all
What's that mean? I need just a little more handholding sorry. Is this intended to be a torrc config? It sounds like a good idea to send everything to one log file for a while, till I debug this.
I'll get back to this, but just for the moment, I notice some interesting repeating lines all over daemon.log re ntpd (but not recently):
... daemon.log:Feb 18 03:10:08 lt8 ntpd[2845]: peer 203.24.120.194 now valid daemon.log:Feb 18 03:15:37 lt8 ntpd[2845]: peer 203.59.9.110 now invalid daemon.log:Feb 18 03:28:34 lt8 ntpd[2843]: adjusting clock frequency by 18.354661 to 3.931566ppm daemon.log:Feb 18 04:08:41 lt8 ntpd[2845]: peer 203.59.9.110 now valid daemon.log:Feb 18 04:15:29 lt8 ntpd[2845]: peer 118.88.20.194 now invalid daemon.log:Feb 18 04:15:34 lt8 ntpd[2845]: peer 130.102.2.123 now invalid daemon.log:Feb 18 04:15:47 lt8 ntpd[2845]: peer 203.171.85.237 now invalid daemon.log:Feb 18 04:15:52 lt8 ntpd[2845]: peer 202.127.210.37 now invalid daemon.log:Feb 18 04:16:01 lt8 ntpd[2845]: peer 202.147.104.52 now invalid daemon.log:Feb 18 04:30:05 lt8 ntpd[2845]: peer 203.59.9.110 now invalid daemon.log:Feb 18 04:59:11 lt8 ntpd[2845]: 10 out of 20 peers valid daemon.log:Feb 18 04:59:11 lt8 ntpd[2845]: bad peer from pool 0.debian.pool.ntp.org (130.102.2.123) daemon.log:Feb 18 04:59:11 lt8 ntpd[2845]: bad peer from pool 0.debian.pool.ntp.org (202.127.210.37) daemon.log:Feb 18 04:59:11 lt8 ntpd[2845]: bad peer from pool 1.debian.pool.ntp.org (203.59.9.110) ...
nothing similar in today's daemon.log though.
and see what you find around where the date lines start to slide or jump past each other. Graph it with rrd/gnuplot if you want. epoch format helps there. Your timekeeping is probably just broke somewhere,
sounds like it.
ie: your system has bad clock drift and also can't sync because there's a firewall blocking ntp, so off goes your time. Check that first.
Here's what I'm getting every 5 minutes:
Feb 20 18:46:59 lt8 ntpd[22662]: ntpd 4.2.6p5@1.2349-o Sat May 12 09:07:18 UTC 2012 (1) 20 Feb 18:46:59 ntpd[22662]: proto: precision = 2.514 usec 20 Feb 18:46:59 ntpd[22662]: Listen and drop on 0 v4wildcard 0.0.0.0 UDP 123 Feb 20 18:46:59 lt8 ntpd[22662]: logging to file /var/log/syslog 20 Feb 18:46:59 ntpd[22662]: Listen and drop on 1 v6wildcard :: UDP 123 20 Feb 18:46:59 ntpd[22662]: Listen normally on 2 lo 127.0.0.1 UDP 123 20 Feb 18:46:59 ntpd[22662]: Listen normally on 3 eth0 192.168.5.2 UDP 123 20 Feb 18:46:59 ntpd[22662]: Listen normally on 4 eth0 fe80::202:8aff:fe21:77f1 UDP 123 20 Feb 18:46:59 ntpd[22662]: Listen normally on 5 lo ::1 UDP 123 20 Feb 18:46:59 ntpd[22662]: peers refreshed 20 Feb 18:46:59 ntpd[22662]: Listening on routing socket on fd #22 for interface updates 20 Feb 18:47:12 ntpd[22662]: ntpd: time slew +0.027801 s
again, I don't (yet) understand why ntpd is Listening as seen above.
there appears to be no problems with firewall, that I can tell yet anyway (also eg the logs above - I've done some grepping etc over the last few days of logs and everything outgoing appears to work, and incoming ORPort DIRPort are successfully forwarded and the relay is chugging away nicely within its limits. Also Internet generally works (downloading debian boot cd for example) - and no ntp server connection failures I could spot in the logs just before.
It's not your isp since you say you're using ntpd against external debian pool and odds are not someone stuffing you bogus timedata. Though your ntp cli query may not be the same as the ntp daemon query re: udp/tcp port they use, stateful firewall timeout, etc... ie: somehow ntp blocked. Or stale/unwriteable startup drift files on disk. If ntpdate is set, then under ntpd running for 15min+, if ntpq -np does not show one asterisk(*) in front
I am running the 5-minutely script at the moment. But perhaps I should keep running the system ntpd at the same time, so I can do this suggested test?
of one of the nonlocal peers, you're not synced. And you'll be no luck until you are, fix that first.
Thank you so much for the pointers - I've a bit of monitoring and testing and reading to do it seems.
Not likely to be Tor or kernel, but you can then next
- move the drive to another known good mobo/cpu box.
- do the kernel event logging thing
- bump tor's loglevel
Thanks again, Zenaan