OPNsense Forum
Archive => 15.7 Legacy Series => Topic started by: phoenix on September 14, 2015, 01:01:45 pm
-
There seems to be an error in the startup of the NTP daemon, after a reboot it doesn't start. The startup script contains the following:
#!/bin/sh
#
# $FreeBSD$
#
# PROVIDE: ntpd
# REQUIRE: DAEMON ntpdate FILESYSTEMS devfs
# BEFORE: LOGIN
# KEYWORD: nojail shutdown
. /etc/rc.subr
name="ntpd"
rcvar="ntpd_enable"
command="/usr/sbin/${name}"
pidfile="/var/run/${name}.pid"
start_precmd="ntpd_precmd"
load_rc_config $name
ntpd_precmd()
{
rc_flags="-c ${ntpd_config} ${ntpd_flags}"
if checkyesno ntpd_sync_on_start; then
rc_flags="-g $rc_flags"
fi
if [ -z "$ntpd_chrootdir" ]; then
return 0;
fi
# If running in a chroot cage, ensure that the appropriate files
# exist inside the cage, as well as helper symlinks into the cage
# from outside.
#
# As this is called after the is_running and required_dir checks
# are made in run_rc_command(), we can safely assume ${ntpd_chrootdir}
# exists and ntpd isn't running at this point (unless forcestart
# is used).
#
if [ ! -c "${ntpd_chrootdir}/dev/clockctl" ]; then
rm -f "${ntpd_chrootdir}/dev/clockctl"
( cd /dev ; /bin/pax -rw -pe clockctl "${ntpd_chrootdir}/dev" )
fi
ln -fs "${ntpd_chrootdir}/var/db/ntp.drift" /var/db/ntp.drift
# Change run_rc_commands()'s internal copy of $ntpd_flags
#
rc_flags="-u ntpd:ntpd -i ${ntpd_chrootdir} $rc_flags"
}
run_rc_command "$1"
A quick find shows the location of the executable to be at: /usr/local/sbin/ntpd & changing the location in the startup script fixes the problem. I have had this for a while but I only reboot infrequently so I never really looked at it, I haven't seen anyone mention this in the forums and surely I can't be the only one who's encountered it?
-
ntpd is not started this way. Why do you think it doesn't start? Did you want to use /etc/rc.conf to start it?
-
It could be my misunderstanding of what happens with NTPD, the reason I didn't think it was running is because the service is always shown as stopped.
-
It has a delayed start (that looks odd if you don't know), or it may have crashed. What happens if you trigger it via the GUI and give it a bit of slack (killing the manual process first of course)?
-
As I mentioned earlier, I didn't bother with this at first. When I rebooted and then looked at main page (possibly after a day or so) I noticed that the NTP daemon was stopped and starting it via the GUI worked fine and it stayed up until the next reboot. Making the change I mentioned earlier starts it a boot and it stays up, should I revert my change?
-
Yes, it's likely that the ntp daemon comes up with a wrong config and the GUI won't able to reconfigure it at all even though it reports one as being online.
If it stopped for some reason, we should see an error in the log and take it from there. :)
-
Now I'm confused (nothing new there :) ), I reverted the change I made to /etc/rc.d/ntpd and rebooted - it started without problems and continues to run with no errors in the log file. I can't see any previous errors so I think this "problem" can be forgotten but I'll keep an eye on it. Just an idea but is thee anything that could send an email to an admin if any services go down and/or restart or is it even worth doing it?
-
A further update. After leaving this for several hours the daemon is still running, I rebooted and the daemon is now stopped but it doesn't show any errors in the ntp.log file just a successful time sync. I copied below the log for both of the reboots today, I suppose what I'm asking now is, should the normal state of the daemon in the web ui be shown as stopped?
Sep 15 09:27:54 opnsense ntpdate[67572]: adjust time server 129.250.35.251 offset -0.003927 sec
Sep 15 09:27:54 opnsense ntp: Successfully synced time after 1 attempts.
Sep 15 09:27:54 opnsense ntp: Starting NTP Daemon.
Sep 15 09:27:54 opnsense ntpd[51584]: ntpd 4.2.8p3@1.3265-o Fri Sep 11 22:45:50 UTC 2015 (1): Starting
Sep 15 09:27:54 opnsense ntpd[51584]: Command line: /usr/local/sbin/ntpd -g -c /var/etc/ntpd.conf -p /var/run/ntpd.pid
Sep 15 09:27:54 opnsense ntpd[51860]: proto: precision = 0.056 usec (-24)
Sep 15 09:27:54 opnsense ntpd[51860]: restrict: 'monitor' cannot be disabled while 'limited' is enabled
Sep 15 09:27:54 opnsense ntpd[51860]: Listen and drop on 0 v6wildcard [::]:123
Sep 15 09:27:54 opnsense ntpd[51860]: Listen and drop on 1 v4wildcard 0.0.0.0:123
Sep 15 09:27:54 opnsense ntpd[51860]: Listen normally on 2 vmx0 [fe80::250:56ff:fe8d:3fd0%1]:123
Sep 15 09:27:54 opnsense ntpd[51860]: Listen normally on 3 vmx0 82.246.46.14:123
Sep 15 09:27:54 opnsense ntpd[51860]: Listen normally on 4 vmx0 [2a01:e35:2f62:e0e0:250:56ff:fe8d:3fd0]:123
Sep 15 09:27:54 opnsense ntpd[51860]: Listen normally on 5 vmx1 192.168.1.1:123
Sep 15 09:27:54 opnsense ntpd[51860]: Listen normally on 6 vmx1 [fe80::1:1%2]:123
Sep 15 09:27:54 opnsense ntpd[51860]: Listen normally on 7 vmx2 192.168.1.240:123
Sep 15 09:27:54 opnsense ntpd[51860]: Listen normally on 8 vmx2 [fe80::250:56ff:fe8d:23a9%3]:123
Sep 15 09:27:54 opnsense ntpd[51860]: Listen normally on 9 lo0 127.0.0.1:123
Sep 15 09:27:54 opnsense ntpd[51860]: Listen normally on 10 lo0 [::1]:123
Sep 15 09:27:54 opnsense ntpd[51860]: Listen normally on 11 lo0 [fe80::1%7]:123
Sep 15 09:27:54 opnsense ntpd[51860]: Listening on routing socket on fd #32 for interface updates
Sep 15 11:50:00 opnsense ntpdate[71143]: adjust time server 129.250.35.251 offset 0.000458 sec
Sep 15 11:50:00 opnsense ntp: Successfully synced time after 1 attempts.
Sep 15 11:50:00 opnsense ntp: Starting NTP Daemon.
Sep 15 11:50:00 opnsense ntpd[81955]: ntpd 4.2.8p3@1.3265-o Fri Sep 11 22:45:50 UTC 2015 (1): Starting
Sep 15 11:50:00 opnsense ntpd[81955]: Command line: /usr/local/sbin/ntpd -g -c /var/etc/ntpd.conf -p /var/run/ntpd.pid
Sep 15 11:50:00 opnsense ntpd[82140]: proto: precision = 0.055 usec (-24)
Sep 15 11:50:00 opnsense ntpd[82140]: restrict: 'monitor' cannot be disabled while 'limited' is enabled
Sep 15 11:50:00 opnsense ntpd[82140]: Listen and drop on 0 v6wildcard [::]:123
Sep 15 11:50:00 opnsense ntpd[82140]: Listen and drop on 1 v4wildcard 0.0.0.0:123
Sep 15 11:50:00 opnsense ntpd[82140]: Listen normally on 2 vmx0 [fe80::250:56ff:fe8d:3fd0%1]:123
Sep 15 11:50:00 opnsense ntpd[82140]: Listen normally on 3 vmx0 82.246.46.14:123
Sep 15 11:50:00 opnsense ntpd[82140]: Listen normally on 4 vmx0 [2a01:e35:2f62:e0e0:250:56ff:fe8d:3fd0]:123
Sep 15 11:50:00 opnsense ntpd[82140]: Listen normally on 5 vmx1 192.168.1.1:123
Sep 15 11:50:00 opnsense ntpd[82140]: Listen normally on 6 vmx1 [fe80::1:1%2]:123
Sep 15 11:50:00 opnsense ntpd[82140]: Listen normally on 7 vmx2 192.168.1.240:123
Sep 15 11:50:00 opnsense ntpd[82140]: Listen normally on 8 vmx2 [fe80::250:56ff:fe8d:23a9%3]:123
Sep 15 11:50:00 opnsense ntpd[82140]: Listen normally on 9 lo0 127.0.0.1:123
Sep 15 11:50:00 opnsense ntpd[82140]: Listen normally on 10 lo0 [::1]:123
Sep 15 11:50:00 opnsense ntpd[82140]: Listen normally on 11 lo0 [fe80::1%7]:123
Sep 15 11:50:00 opnsense ntpd[82140]: Listening on routing socket on fd #32 for interface updates
I can't see anything different in those log entries, is there somewhere that an error would show?
-
Maybe the service probing from the GUI is simply broken. It is persistently showing ntpd as stopped? What happens when you type:
# pgrep ntpd
-
That command shows the following:
pgrep ntpd
98395
-
So it's running, but won't find the service in the GUI. Let me check the code to see if it does anything weird.
-
So it's running, but won't find the service in the GUI. Let me check the code to see if it does anything weird.
Don't forget that this 'appears' to be an occasional problem. After reverting my attempt at fixing it earlier the daemon started without problems at the following boot. In fact, I've just rebooted it again and the daemon is running OK. It's not really a major problem for me so if you can't find any problems in the code, what diagnostic information would you like if this happens again?
-
Hi Bill,
no further diagnostics info needed for now. I saw the same thing happening on my box, so I improved the service probing slightly:
https://github.com/opnsense/core/commit/e8a19ee303f1c47b1eacbf56565be66248bf557e
This will be in 15.7.14, maybe you can keep an eye on it if it is the same or better now?
Cheers,
Franco
-
Hi Franco
Yes, I'll keep an eye on it and see what happens and let you know if anything goes awry. :)
-
Hi Franco
I updated to the latest release 15.7.14 today and rebooted without problems, the NTP service shows as running and also after a couple of further reboots. This evening I rebooted again and the service shows as stopped but is running:
pgrep ntpd
7757
Any specific info you need for this?
[EDIT]... and a reload of the UI while I was viewing some settings and the NTP daemon is now showing running. I think this is one of these very low-level priority bugs that doesn't matter to me now I know that it's just a UI problem. :)
-
The only thing I can think of is that it's really going down for a few seconds I've been looking in the wrong location. See, we learned something. :)
-
The only thing I can think of is that it's really going down for a few seconds I've been looking in the wrong location. See, we learned something. :)
It's always good to learn something. :) I'll keep an eye on it see if it's just a momentary problem and leave it at that unless it does actually stop for any period. Thanks for your input on this.
-
I just now noticed nptd not being up after applying a WAN interface configuration. The pid changes and since the startup is delayed it looks like it's down for a few seconds. Can you confirm this?
-
My recollection is that yesterday the NTP daemon going down was about the time I was modifying something on the WAN interface, unfortunately I can't seem to reproduce it by doing anything to the WAN interface today.
-
Ok, the plot thickens.... I'll keep this under observation though it probably doesn't hurt at all. Just a bit iffy. :)
-
I think "iffy" is a good word for it. I was making some more changes yesterday evening and I did notice that the daemon wasn't showing started after each reboot (about four, in total) but a refresh of the UI then showed it as running. As I say, it's not a problem for me.
-
Ok, marked as [IFFY]. That should do it for now. :D