Error msg starting Telegraf 1.19.0

Started by andrema2, July 25, 2021, 02:49:47 PM

July 25, 2021, 02:49:47 PM Last Edit: July 25, 2021, 09:49:20 PM by andrema2
Hi

I started seeing these error messages when Telegraf starts. Telegraf writes its data to an InfluxDB 1.8.6 instance.


time="2021-07-25T09:32:00-03:00" level=error msg="failed to open. Ignored. open /.cache/snowflake/ocsp_response_cache.json: no such file or directory\n" func="gosnowflake.(*defaultLogger).Errorf" file="log.go:120"

time="2021-07-25T09:32:00-03:00" level=error msg="failed to create cache directory. /.cache/snowflake, err: mkdir /.cache: permission denied. ignored\n" func="gosnowflake.(*defaultLogger).Errorf" file="log.go:120"

It seems the data is being written, except for the unbound input, which I normally add in the unbound.inc file.

That input is getting this error:
   E! [inputs.unbound] Error in plugin: error gathering metrics: error running unbound-control: exit status 1 (/usr/local/sbin/unbound-control [stats_noreset])

I had the same issue in the past. What I did to solve it was:

ln -s /var/unbound/unbound.conf /usr/local/etc/unbound/unbound.conf
add /usr/sbin/pw groupmod unbound -m telegraf to /usr/local/opnsense/scripts/OPNsense/Telegraf/setup.sh

and in /usr/local/opnsense/service/templates/OPNsense/Telegraf/telegraf.conf:
[[inputs.unbound]]
   binary = "/usr/local/sbin/unbound-control"
   thread_as_tag = true

It was working before the Telegraf version upgrade.

If I test telegraf with:
/usr/local/bin/telegraf --config /usr/local/etc/telegraf.conf --test --input-filter unbound

It works.
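One thing that may be worth checking: a manual --test run from a root shell executes unbound-control as root, while the service presumably runs it as the telegraf user. That is an assumption and is not confirmed anywhere in this thread. The sketch below builds the same command line that appears in the error message and runs it as another user, so the exit status can be compared. The helper names unbound_cmd and run_as are made up for illustration.

```shell
#!/bin/sh
# Sketch only: reproduce the unbound-control call as an unprivileged user.
# Assumption: the Telegraf service runs as the user "telegraf".

# Print the command line seen in the Telegraf error message.
# $1 = path to unbound-control, $2 = optional config file path.
unbound_cmd() {
    if [ -n "$2" ]; then
        printf '%s -c %s stats_noreset' "$1" "$2"
    else
        printf '%s stats_noreset' "$1"
    fi
}

# Run a command string as another user and report its exit status.
# $1 = user name, $2 = command string. Must be called as root.
run_as() {
    su -m "$1" -c "$2"
    echo "exit status: $?"
}

# Example, as root on the firewall:
# run_as telegraf "$(unbound_cmd /usr/local/sbin/unbound-control /var/unbound/unbound.conf)"
```

If the command fails here but works as root, the message printed by unbound-control itself should say more than the bare "exit status 1" that Telegraf logs.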

I'd appreciate any help.

Am I the only one with this problem? I thought more people would be using it.

If you are using it and it works, would you mind sharing your Telegraf config or how you set it up?

Your luck is waiting for you at GitHub: https://github.com/opnsense/plugins/issues/new?assignees=&labels=&template=feature_request.md&title=

It's difficult to track and discuss multiple required code changes in a forum post...


Cheers,
Franco

The error only appears on startup and is unrelated to Unbound, whether it is enabled or not.
For me, Telegraf works just fine. (I don't use the Unbound input, which is unrelated to the error.)

Quote from: mimugmail on July 28, 2021, 09:45:14 AM
The error only appears on startup and is unrelated to Unbound, whether it is enabled or not.
For me, Telegraf works just fine. (I don't use the Unbound input, which is unrelated to the error.)

Would it be too much to ask you to add the Unbound input just to test it? I know you are one of the gurus here, so maybe you'll see something that I didn't...



August 03, 2021, 10:04:59 AM #7 Last Edit: August 03, 2021, 10:11:36 AM by andrema2
Quote from: mimugmail on August 01, 2021, 08:34:48 PM
https://github.com/opnsense/plugins/pull/2488

How should I download and test it? Dumb question, I know. I tried to update the plugins but it didn't show up.


August 06, 2021, 07:37:25 PM #9 Last Edit: August 06, 2021, 09:01:12 PM by andrema2
Quote from: mimugmail on August 03, 2021, 10:45:33 AM
It'll take one or two releases to go to stable.



Hi Mimugmail, it got into this one.

Nevertheless, if I enable unbound it generates the same error as before.

   E! [inputs.unbound] Error in plugin: error gathering metrics: error running unbound-control: exit status 1 (/usr/local/sbin/unbound-control [-c /var/unbound/unbound.conf stats_noreset])

My troubleshooting steps so far: I removed the plugin, then looked around for leftover Telegraf files using find / -name telegraf. I deleted anything that was left. I also deleted the telegraf user and group.
Then I reinstalled the plugin. All the settings from before came back checked, so they are probably stored somewhere else, but I'm assuming that doesn't mean much.

As soon as I start Telegraf it generates the error above. I tried with and without the wheel group set.

Any other ideas ?



Quote from: mimugmail on August 06, 2021, 10:47:10 PM
Did you also enable wheel group?

Yes, I did. Both on and off.

With it on, I SSHed into the firewall and checked whether the telegraf user had wheel as one of its groups. It didn't. I tried adding it, but that didn't solve the problem.
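For anyone repeating this check, here is a minimal sketch of how the group membership could be tested from a shell. The function name in_group is made up for illustration, and the user name telegraf and the groups wheel and unbound are simply the ones mentioned in this thread.

```shell
#!/bin/sh
# Sketch only: test whether a group name appears in a list of groups.
# $1 = space-separated group list (e.g. the output of `id -Gn USER`)
# $2 = group name to look for
# Returns 0 if the group is in the list, 1 otherwise.
in_group() {
    for g in $1; do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# Example on the firewall:
# in_group "$(id -Gn telegraf)" wheel   && echo "telegraf is in wheel"
# in_group "$(id -Gn telegraf)" unbound && echo "telegraf is in unbound"
```

Note that a running process keeps the groups it started with, so the Telegraf service has to be restarted after any group change before the result means anything.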

For me it also works only a couple of times, maybe after restarting Unbound or OPN itself

Quote from: mimugmail on August 07, 2021, 09:34:42 AM
For me it also works only a couple of times, maybe after restarting Unbound or OPN itself

I just tried a reboot on both my systems (primary and backup): same error.

One thing that is bothering me is that the logs show the time of the error 3 hours ahead of my local time, i.e. in UTC. My system is set to America/Sao_Paulo. Does that help with the troubleshooting?

Is it possible to roll back the Telegraf plugin to a version before 1.19?

Below is a post from a thread on the Netgate forum that I used to make it work in the past, until 1.19 came along. It might also give a hint:

uhm i have the same error on my 2.5.0 now.. but idk .. anyway
i was reading this
https://github.com/influxdata/telegraf/tree/master/plugins/inputs/unbound

## The default location of the unbound config file can be overridden with:
# config_file = "/etc/unbound/unbound.conf"

but this does not work because config_file is not defined inside pfsense plugin

binary = "/usr/local/sbin/unbound-control"

this works, but if you add -c /var/unbound/unbound.conf it's unable to execute the command

the only workaround i found is to create a symlink inside the unbound-control default directory...

-c file       config file, default is /usr/local/etc/unbound/unbound.conf

the workaround is as follows:

rm /usr/local/etc/unbound/unbound.conf
ln -s /var/unbound/unbound.conf /usr/local/etc/unbound/unbound.conf

inside Telegraf on pfSense, under "Additional configuration for Telegraf":

[[inputs.unbound]]
binary = "/usr/local/sbin/unbound-control"
thread_as_tag = false

to test the plugin:

/usr/local/bin/telegraf -config=/usr/local/etc/telegraf.conf --test --input-filter unbound
2020-04-03T20:57:16Z I! Starting Telegraf 1.13.4
> unbound,host=pfSense.kiokoman.home infra_cache_count=0,key_cache_count=0,mem_cache_message=66072,mem_cache_rrset=66072,mem_mod_iterator=1bla bla bla
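The symlink workaround quoted above deletes the existing file with rm before linking. A slightly more careful variant is sketched below: it keeps a backup of any real file at the destination and does nothing if the link is already in place. The function name link_conf and the .bak suffix are made up for illustration, and the two paths in the example are the ones used in this thread.

```shell
#!/bin/sh
# Sketch only: point DST at SRC with a symlink, keeping a backup of any
# regular file that already exists at DST.
# $1 = SRC (the real config file), $2 = DST (where unbound-control looks)
link_conf() {
    src=$1
    dst=$2
    # Refuse to create a dangling link.
    [ -e "$src" ] || { echo "missing source: $src" >&2; return 1; }
    # Already linked to the right place: nothing to do.
    if [ -L "$dst" ] && [ "$(readlink "$dst")" = "$src" ]; then
        return 0
    fi
    # Move a real file aside instead of deleting it.
    if [ -e "$dst" ] && [ ! -L "$dst" ]; then
        mv "$dst" "$dst.bak" || return 1
    fi
    ln -sfn "$src" "$dst"
}

# Example, as root on the firewall:
# link_conf /var/unbound/unbound.conf /usr/local/etc/unbound/unbound.conf
```

Whether the link survives a firmware or package upgrade is not something this thread establishes, so it may need to be re-created afterwards.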