OPNsense Forum

Archive => 18.7 Legacy Series => Topic started by: bringha on September 28, 2018, 10:19:08 pm

Title: Upgrade to 18.7.4: telegraf plugin broken?
Post by: bringha on September 28, 2018, 10:19:08 pm
Hi there,

After upgrading to 18.7.4, I noticed that the telegraf plugin seems to be broken, mainly due to the inputs.system module. I now suddenly have two log files, one in /var/log/telegraf/telegraf.log and one in /var/log/telegraf.log. Although the config says
Code: [Select]
[global_tags]

[agent]
  interval = "10s"
  round_interval = false
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_jitter = "0s"
  precision = ""
  debug = false
  quiet = true
  logfile = "/var/log/telegraf.log"
  hostname = "opnsense"
  omit_hostname = false

[[outputs.influxdb]]
  urls = ["http://192.168.1.205:8086"]
  database = "telegraf"
  retention_policy = ""
  write_consistency = "any"
  timeout = "5s"
  username = "influx"
  password = "XXXXXXXXXX"




[[inputs.cpu]]
  percpu = true
  totalcpu = true
  collect_cpu_time = false

[[inputs.disk]]
  mount_points = ["/"]

[[inputs.diskio]]

[[inputs.mem]]

[[inputs.processes]]


[[inputs.system]]

[[inputs.net]]
that /var/log/telegraf.log should be used, telegraf uses the other one and writes tons of messages like
Code: [Select]
2018-09-28T19:20:23Z E! Error in plugin [inputs.system]: open /var/run/utmp: no such file or directory
2018-09-28T19:20:33Z E! Error in plugin [inputs.system]: open /var/run/utmp: no such file or directory
2018-09-28T19:20:43Z E! Error in plugin [inputs.system]: open /var/run/utmp: no such file or directory
2018-09-28T19:20:53Z E! Error in plugin [inputs.system]: open /var/run/utmp: no such file or directory
2018-09-28T19:21:03Z E! Error in plugin [inputs.system]: open /var/run/utmp: no such file or directory
2018-09-28T19:21:13Z E! Error in plugin [inputs.system]: open /var/run/utmp: no such file or directory
2018-09-28T19:21:23Z E! Error in plugin [inputs.system]: open /var/run/utmp: no such file or directory
2018-09-28T19:21:33Z E! Error in plugin [inputs.system]: open /var/run/utmp: no such file or directory
2018-09-28T19:21:43Z E! Error in plugin [inputs.system]: open /var/run/utmp: no such file or directory
2018-09-28T19:21:53Z E! Error in plugin [inputs.system]: open /var/run/utmp: no such file or directory
2018-09-28T19:22:03Z E! Error in plugin [inputs.system]: open /var/run/utmp: no such file or directory
E! Unable to append to /var/log/telegraf.log (open /var/log/telegraf.log: permission denied), using stderr
/var/log/telegraf.log now belongs to root:root but should probably be root:telegraf (?).

The unpleasant side effect is that OPNsense downlink throughput drops to <5% of its normal performance.
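
For anyone who wants a quick summary of which plugin is flooding the log, here is a minimal sketch in POSIX sh. The helper name count_plugin_errors is made up for illustration and is not part of telegraf; it only assumes the "E! Error in plugin [name]" line format shown above.

```shell
#!/bin/sh
# Sketch: count telegraf "Error in plugin" lines per plugin.
# count_plugin_errors is a hypothetical helper name, not a telegraf command.
count_plugin_errors() {
    # $1: path to a telegraf log file
    # Extract the plugin name between "[" and "]" after "Error in plugin",
    # then count how often each name occurs.
    sed -n 's/.*E! Error in plugin \[\([^]]*\)].*/\1/p' "$1" |
        sort | uniq -c | awk '{ print $2 ": " $1 }'
}
```

Called as count_plugin_errors /var/log/telegraf/telegraf.log, it prints one "plugin: count" line per failing plugin. Lines such as the "Unable to append" message are ignored because they do not name a plugin.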

Br br
Title: Re: Upgrade to 18.7.4: telegraf plugin broken?
Post by: mimugmail on September 29, 2018, 07:35:19 am
Can you try to revert telegraf pkg (not the plugin) to the old version from 18.7.3?
Title: Re: Upgrade to 18.7.4: telegraf plugin broken?
Post by: bringha on September 29, 2018, 08:52:27 am
Err - I would love to, but I have never done that before.

Is it 'opnsense-revert -r 18.7.3 telegraf'?

Br br
Title: Re: Upgrade to 18.7.4: telegraf plugin broken?
Post by: mimugmail on September 29, 2018, 10:21:38 am
I think so, but I'm only on mobile today.

https://docs.opnsense.org/manual/opnsense_tools.html
Title: Re: Upgrade to 18.7.4: telegraf plugin broken?
Post by: bringha on September 30, 2018, 02:48:18 pm
Well, the utmp error message disappeared, but telegraf still shows this error message:
Code: [Select]
E! Unable to append to /var/log/telegraf.log (open /var/log/telegraf.log: permission denied), using stderr
E! Unable to append to /var/log/telegraf.log (open /var/log/telegraf.log: permission denied), using stderr
I changed /var/log/telegraf to telegraf:telegraf, but even then there was no change. The problem that throughput drops dramatically also persists. I have now temporarily switched off telegraf.
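
To compare who owns the two log locations, something like the following could be used. It is only a sketch: show_owner is a hypothetical helper name, and it simply reads the owner and group columns from `ls -ld`.

```shell
#!/bin/sh
# Sketch: print "path owner:group" for each path given.
# show_owner is a hypothetical helper name used for illustration only.
show_owner() {
    for path in "$@"; do
        if [ -e "$path" ]; then
            # Fields 3 and 4 of `ls -ld` output are the owner and the group.
            ls -ld "$path" | awk -v p="$path" '{ print p " " $3 ":" $4 }'
        else
            echo "$path missing"
        fi
    done
}
```

For example, show_owner /var/log/telegraf.log /var/log/telegraf /var/log/telegraf/telegraf.log lists all three paths, so a root:root file next to a telegraf-owned directory stands out.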

Note: we use telegraf in a larger cloud environment and have sometimes observed that when telegraf wants to access files but cannot due to permissions, CPU load on that node rises to 100% and the machine more or less stops doing productive work ...

However, could it be that with the recent kernel/system upgrade, utmp was replaced by the more modern utx? At least the man page on my sense indicates so .... That would then need some adaptation in telegraf too ....
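
A quick way to check this guess is to see which login-record file actually exists. The sketch below assumes the FreeBSD utx path /var/run/utx.active next to the legacy /var/run/utmp from the error message. find_login_db and its optional root argument are hypothetical, added only so the check can be tried against a scratch directory.

```shell
#!/bin/sh
# Sketch: report which login-record database exists under a root directory.
# find_login_db and its root argument are hypothetical, for illustration.
find_login_db() {
    root="${1:-}"
    found=1
    # /var/run/utmp is the legacy path from the telegraf error message;
    # /var/run/utx.active is assumed to be the FreeBSD utx replacement.
    for f in /var/run/utmp /var/run/utx.active; do
        if [ -e "$root$f" ]; then
            echo "present: $f"
            found=0
        else
            echo "missing: $f"
        fi
    done
    # Exit status 0 if at least one of the files exists, 1 otherwise.
    return "$found"
}
```

On the firewall itself, calling find_login_db without an argument checks the real filesystem.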

Br br
Title: Re: Upgrade to 18.7.4: telegraf plugin broken?
Post by: bringha on October 01, 2018, 11:16:30 pm
OK - here we are with the issue:

There is a conflicting config in telegraf with the log files:

Via the GUI, telegraf is instructed to write to the log file /var/log/telegraf.log (see above); this is written to the telegraf config file.

However, the start script for telegraf in /usr/local/etc/rc.d/telegraf configures:

Code: [Select]
(...)
name="telegraf"
rcvar=telegraf_enable
load_rc_config $name

: ${telegraf_enable:="NO"}
: ${telegraf_user:="telegraf"}
: ${telegraf_group:="telegraf"}
: ${telegraf_flags:="-quiet"}
: ${telegraf_conf:="/usr/local/etc/${name}.conf"}
: ${telegraf_options:="${telegraf_flags} -config=${telegraf_conf}"}

logfile="/var/log/telegraf/${name}.log"
pidfile="/var/run/${name}.pid"
command=/usr/sbin/daemon
start_precmd="telegraf_prestart"
start_cmd="telegraf_start"
stop_cmd="telegraf_stop"
(...)

This causes a conflict over where to write ...

Only one of the two is possible ....

The solution would be to leave the logfile entry in telegraf.conf empty:

Code: [Select]
(...)
[agent]
  interval = "10s"
  round_interval = false
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_jitter = "0s"
  precision = ""
  debug = false
  quiet = true
  logfile = ""    # <--- leave empty
  hostname = "opnsense"
  omit_hostname = false
(...)

Since I assume that this file is regenerated automatically, this requires a change in the plugin code.
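
Until that is fixed, the mismatch can be spotted with a small check like the one below. This is only a sketch: conf_logfile and check_log_conflict are hypothetical helper names, and the rc script's log path is passed in by hand rather than parsed from /usr/local/etc/rc.d/telegraf.

```shell
#!/bin/sh
# Sketch: detect a logfile setting in telegraf.conf that disagrees with
# the path the rc script logs to. Both helper names are hypothetical.
conf_logfile() {
    # $1: path to telegraf.conf; prints the value of the first
    # `logfile = "..."` line (prints nothing if it is unset or empty).
    sed -n 's/^[[:space:]]*logfile[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$1" |
        head -n 1
}

check_log_conflict() {
    # $1: path to telegraf.conf, $2: log path used by the rc script
    conf_log=$(conf_logfile "$1")
    if [ -n "$conf_log" ] && [ "$conf_log" != "$2" ]; then
        echo "conflict: config wants $conf_log, rc script uses $2"
        return 1
    fi
    echo "ok: no conflicting logfile setting"
}
```

With the config from the first post, check_log_conflict /usr/local/etc/telegraf.conf /var/log/telegraf/telegraf.log would report the conflict; with logfile = "" it reports ok.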

Br br
Title: Re: Upgrade to 18.7.4: telegraf plugin broken?
Post by: mimugmail on October 02, 2018, 09:40:21 am
Thanks for all the info .. it seems this was changed in the rc script with the jump from telegraf 1.6.X to 1.7.X.
I'll try to fix this with the next version ..
Title: Re: Upgrade to 18.7.4: telegraf plugin broken?
Post by: bringha on October 02, 2018, 10:33:41 am
Thanks, that's great!

Br br
Title: Re: Upgrade to 18.7.4: telegraf plugin broken?
Post by: franco on October 02, 2018, 04:49:13 pm
Yup, will be in 18.7.5 next week...


Cheers,
Franco