OPNsense Forum

Archive => 21.1 Legacy Series => Topic started by: andrema2 on July 25, 2021, 02:49:47 pm

Title: Error msg starting Telegraf 1.19.0
Post by: andrema2 on July 25, 2021, 02:49:47 pm
Hi,

I started seeing these error messages when Telegraf starts. Telegraf is writing its data to an InfluxDB 1.8.6 instance.


time="2021-07-25T09:32:00-03:00" level=error msg="failed to open. Ignored. open /.cache/snowflake/ocsp_response_cache.json: no such file or directory\n" func="gosnowflake.(*defaultLogger).Errorf" file="log.go:120"

time="2021-07-25T09:32:00-03:00" level=error msg="failed to create cache directory. /.cache/snowflake, err: mkdir /.cache: permission denied. ignored\n" func="gosnowflake.(*defaultLogger).Errorf" file="log.go:120"

The data still seems to be written, with the exception of the Unbound input, which I normally add in the unbound.inc file.

That input is getting this error:
   E! [inputs.unbound] Error in plugin: error gathering metrics: error running unbound-control: exit status 1 (/usr/local/sbin/unbound-control [stats_noreset])

I had the same issue in the past. What I did to solve it was:

ln -s /var/unbound/unbound.conf /usr/local/etc/unbound/unbound.conf
add /usr/sbin/pw groupmod unbound -m telegraf to /usr/local/opnsense/scripts/OPNsense/Telegraf/setup.sh

and add the following to /usr/local/opnsense/service/templates/OPNsense/Telegraf/telegraf.conf:
[[inputs.unbound]]
   binary = "/usr/local/sbin/unbound-control"
   thread_as_tag = true

This was working before the Telegraf version upgrade.

If I test telegraf with:
/usr/local/bin/telegraf --config /usr/local/etc/telegraf.conf --test --input-filter unbound

It works.

I appreciate any help
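
The Telegraf log above only reports "exit status 1" for unbound-control, which hides the actual reason for the failure. As a minimal sketch (the helper name run_and_report is made up for illustration and is not part of Telegraf or OPNsense), the same command can be run by hand so that its exit status and stderr are both visible. To be meaningful it would need to be run as the account the daemon uses, which this thread suggests is the telegraf user.

```shell
#!/bin/sh
# run_and_report CMD [ARGS...]
# Hypothetical helper: runs CMD, discards stdout, and prints the exit
# status plus anything the command wrote to stderr.
run_and_report() {
    err_file=$(mktemp) || return 2
    status=0
    "$@" >/dev/null 2>"$err_file" || status=$?
    echo "exit status: $status"
    if [ -s "$err_file" ]; then
        echo "stderr: $(cat "$err_file")"
    fi
    rm -f "$err_file"
    return "$status"
}

# Assumed usage on the firewall, mirroring the command from the error message:
# run_and_report /usr/local/sbin/unbound-control stats_noreset
```

If the command succeeds as root but fails under the daemon's account, that would point towards a permission difference rather than a Telegraf configuration problem.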
Title: Re: Error msg starting Telegraf 1.19.0
Post by: andrema2 on July 27, 2021, 08:31:54 pm
Am I the only one with this problem? I thought more people would be using it.

If you are using it and it's working, would you mind sharing your Telegraf config or how you set it up?
Title: Re: Error msg starting Telegraf 1.19.0
Post by: franco on July 27, 2021, 09:38:19 pm
Your luck is waiting for you at GitHub: https://github.com/opnsense/plugins/issues/new?assignees=&labels=&template=feature_request.md&title=

It's difficult to track and discuss multiple required code changes in a forum post...


Cheers,
Franco
Title: Re: Error msg starting Telegraf 1.19.0
Post by: mimugmail on July 28, 2021, 09:45:14 am
The error appears only on startup and is not related to Unbound, whether it is enabled or not.
For me, Telegraf works just fine. (I don't use the Unbound input, which is not related to the error.)
Title: Re: Error msg starting Telegraf 1.19.0
Post by: andrema2 on July 30, 2021, 06:17:38 pm
Quote from: mimugmail on July 28, 2021, 09:45:14 am
The error comes on startup only and is not related to Unbound if enabled or not.
For me, Telegraf works just fine. (I dont use Unbound input which is not related to the error)

Would it be too much to ask you to add the Unbound input just to test it? I know you are one of the gurus here, so maybe you'll see something that I didn't...
Title: Re: Error msg starting Telegraf 1.19.0
Post by: mimugmail on July 31, 2021, 03:22:16 pm
I'm working on a patch ..
Title: Re: Error msg starting Telegraf 1.19.0
Post by: mimugmail on August 01, 2021, 08:34:48 pm
https://github.com/opnsense/plugins/pull/2488
Title: Re: Error msg starting Telegraf 1.19.0
Post by: andrema2 on August 03, 2021, 10:04:59 am
Quote from: mimugmail on August 01, 2021, 08:34:48 pm
https://github.com/opnsense/plugins/pull/2488

How should I download and test it? Dumb question, I know. I tried to update the plugins but it didn't show up.
Title: Re: Error msg starting Telegraf 1.19.0
Post by: mimugmail on August 03, 2021, 10:45:33 am
It'll take one or two releases to go to stable.

Title: Re: Error msg starting Telegraf 1.19.0
Post by: andrema2 on August 06, 2021, 07:37:25 pm
Quote from: mimugmail on August 03, 2021, 10:45:33 am
It'll take one or two releases to go to stable.

Hi Mimugmail, it got into this release.

Nevertheless, if I enable Unbound, it generates the same error as before.

   E! [inputs.unbound] Error in plugin: error gathering metrics: error running unbound-control: exit status 1 (/usr/local/sbin/unbound-control [-c /var/unbound/unbound.conf stats_noreset])

My troubleshooting steps so far: I removed the plugin, then looked around (using find / -name telegraf) and didn't find any Telegraf files once the plugin was removed. I deleted anything that was left. I also deleted the telegraf user and group.
Then I reinstalled the plugin. All the settings from before appeared checked, so they are probably stored in a different place, but I'm assuming that doesn't mean much.

As soon as I start Telegraf, it generates the error above. I tried with and without the wheel group set.

Any other ideas?

Title: Re: Error msg starting Telegraf 1.19.0
Post by: mimugmail on August 06, 2021, 10:47:10 pm
Did you also enable wheel group?
Title: Re: Error msg starting Telegraf 1.19.0
Post by: andrema2 on August 07, 2021, 12:28:41 am
Quote from: mimugmail on August 06, 2021, 10:47:10 pm
Did you also enable wheel group?

Yes, I did. I tried with it both on and off.

With it on, I connected to the firewall over SSH and checked whether the telegraf user had wheel as one of its groups. It didn't. I tried to add it, but that didn't solve the problem.
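
The group check described above can be scripted. The following is a minimal sketch, not something taken from the plugin: the function name user_in_group is made up for illustration, and it relies only on the standard id -Gn command, which lists every group a user belongs to.

```shell
#!/bin/sh
# user_in_group USER GROUP
# Hypothetical helper: succeeds if USER is a member of GROUP.
user_in_group() {
    # id -Gn prints the user's group names separated by spaces.
    for g in $(id -Gn "$1" 2>/dev/null); do
        if [ "$g" = "$2" ]; then
            return 0
        fi
    done
    return 1
}

# Assumed usage on the firewall (user and group names as used in this thread):
# user_in_group telegraf wheel   && echo "telegraf is in wheel"
# user_in_group telegraf unbound && echo "telegraf is in unbound"
```

A process only picks up its group memberships when it starts, so a group added while the daemon is running would not take effect until the service is restarted.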
Title: Re: Error msg starting Telegraf 1.19.0
Post by: mimugmail on August 07, 2021, 09:34:42 am
For me it also works only a couple of times, maybe after restarting Unbound or OPNsense itself.
Title: Re: Error msg starting Telegraf 1.19.0
Post by: andrema2 on August 07, 2021, 02:59:21 pm
Quote from: mimugmail on August 07, 2021, 09:34:42 am
For me it also works only a couple of times, maybe with restaring Unbound or OPN itself

I just tried a reboot on both my systems (primary and backup): same error.

One thing that is bothering me is that the logs show the time of the error 3 hours ahead of my local time, which means it is in UTC. My system is set to America/Sao_Paulo. Does that help with the troubleshooting?

Is it possible to roll back to a version of the Telegraf plugin before 1.19?
Title: Re: Error msg starting Telegraf 1.19.0
Post by: andrema2 on August 07, 2021, 03:30:02 pm
Below is a post from a thread on the Netgate forum that I used to make it work in the past, until 1.19 came along. It might also give a hint.

Code:
uhm i have the same error on my 2.5.0 now.. but idk .. anyway
i was reading this
https://github.com/influxdata/telegraf/tree/master/plugins/inputs/unbound

## The default location of the unbound config file can be overridden with:
# config_file = "/etc/unbound/unbound.conf"

but this does not work because config_file is not defined inside pfsense plugin

binary = "/usr/local/sbin/unbound-control"

this work but if you add -c /var/unbound/unbound.conf it's unable to execute the command

the only workaround i found is to create a symlink inside the unbound-control default directory...

-c file       config file, default is /usr/local/etc/unbound/unbound.conf

the workaround is as follow:

rm /usr/local/etc/unbound/unbound.conf
ln -s /var/unbound/unbound.conf /usr/local/etc/unbound/unbound.conf

inside
telegraf on pfsense
Additional configuration for Telegraf:

[[inputs.unbound]]
binary = "/usr/local/sbin/unbound-control"
thread_as_tag = false

to test the plugin:

/usr/local/bin/telegraf -config=/usr/local/etc/telegraf.conf --test --input-filter unbound
2020-04-03T20:57:16Z I! Starting Telegraf 1.13.4
> unbound,host=pfSense.kiokoman.home infra_cache_count=0,key_cache_count=0,mem_cache_message=66072,mem_cache_rrset=66072,mem_mod_iterator=1bla bla bla
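
The symlink workaround quoted above removes and recreates the link every time it is applied. As a rough sketch, a small helper could make that step safe to repeat by only recreating the link when it does not already point at the expected file. The name ensure_symlink is made up for illustration; the paths in the usage comment are the ones from the quoted post.

```shell
#!/bin/sh
# ensure_symlink TARGET LINK
# Hypothetical helper: makes LINK a symlink to TARGET, doing nothing
# if it already points there.
ensure_symlink() {
    target=$1
    link=$2
    if [ -L "$link" ] && [ "$(readlink "$link")" = "$target" ]; then
        echo "ok: $link already points to $target"
        return 0
    fi
    # Remove whatever file or stale link is there, then create the link.
    rm -f "$link" && ln -s "$target" "$link" && echo "linked: $link -> $target"
}

# Assumed usage, with the paths from the quoted workaround:
# ensure_symlink /var/unbound/unbound.conf /usr/local/etc/unbound/unbound.conf
```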
Title: Re: Error msg starting Telegraf 1.19.0
Post by: andrema2 on August 12, 2021, 02:48:35 am
Quote from: mimugmail on August 07, 2021, 09:34:42 am
For me it also works only a couple of times, maybe with restaring Unbound or OPN itself

So, I downloaded version 1.19.2 of Telegraf. Still no luck. I also added the telegraf user to the admins group; it didn't change the results.
I'm still trying.
Title: Re: Error msg starting Telegraf 1.19.0
Post by: mimugmail on September 03, 2021, 02:46:55 pm
I checked back and I have metrics for the last 2 days. I have now also added user process "root" for Telegraf; maybe this helps. I believe that, with your multiple tries, you ended up in a state where nothing works?
Title: Re: Error msg starting Telegraf 1.19.0
Post by: andrema2 on September 03, 2021, 04:04:12 pm
Hi,

I think I was able to create a workaround. Not the safest one...

In /usr/local/etc/rc.d/telegraf, instead of running the daemon as the telegraf user, I changed it to run as root.

I don't have any errors now, and the data is being saved to InfluxDB as I wanted.

It is still not clear why the telegraf user can't run Telegraf in daemon mode, but the problem is certainly related to that.
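
One possible explanation, which this thread does not confirm, is that the telegraf user cannot read one of the files unbound-control needs. A minimal sketch for checking that follows; the function name report_unreadable is made up for illustration, and the exact list of files to check is an assumption. To be meaningful it would have to be run as the telegraf user rather than as root.

```shell
#!/bin/sh
# report_unreadable FILE...
# Hypothetical helper: prints every given file the current user cannot
# read (missing files count as unreadable). Fails if any was found.
report_unreadable() {
    rc=0
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "not readable: $f"
            rc=1
        fi
    done
    return "$rc"
}

# Assumed usage as the telegraf user. The config path is from this thread;
# any key or certificate files would need to be added for the local setup.
# report_unreadable /var/unbound/unbound.conf
```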
Title: Re: Error msg starting Telegraf 1.19.0
Post by: mimugmail on September 03, 2021, 05:17:38 pm
Yep, once merged, this will be the default when the wheel group checkbox is ticked.