OPNsense Forum
Archive => 21.1 Legacy Series => Topic started by: andrema2 on July 25, 2021, 02:49:47 pm
-
Hi
I started seeing these error messages when Telegraf starts. Telegraf writes its data to InfluxDB 1.8.6:
time="2021-07-25T09:32:00-03:00" level=error msg="failed to open. Ignored. open /.cache/snowflake/ocsp_response_cache.json: no such file or directory\n" func="gosnowflake.(*defaultLogger).Errorf" file="log.go:120"
time="2021-07-25T09:32:00-03:00" level=error msg="failed to create cache directory. /.cache/snowflake, err: mkdir /.cache: permission denied. ignored\n" func="gosnowflake.(*defaultLogger).Errorf" file="log.go:120"
The data seems to be written, with the exception of the Unbound input, which I normally add in the unbound.inc file.
That input fails with this error:
E! [inputs.unbound] Error in plugin: error gathering metrics: error running unbound-control: exit status 1 (/usr/local/sbin/unbound-control [stats_noreset])
In the past I had the same issue. What I did to solve it was:
ln -s /var/unbound/unbound.conf /usr/local/etc/unbound/unbound.conf
Add /usr/sbin/pw groupmod unbound -m telegraf to /usr/local/opnsense/scripts/OPNsense/Telegraf/setup.sh
and add the following to /usr/local/opnsense/service/templates/OPNsense/Telegraf/telegraf.conf:
[[inputs.unbound]]
binary = "/usr/local/sbin/unbound-control"
thread_as_tag = true
This was working before the Telegraf version upgrade.
If I test Telegraf with:
/usr/local/bin/telegraf --config /usr/local/etc/telegraf.conf --test --input-filter unbound
It works.
I'd appreciate any help.
-
Am I the only one with this problem? I thought more people would be using it.
If you are using it and it's working, would you mind sharing your Telegraf config or how you did it?
-
Your luck is waiting for you at GitHub: https://github.com/opnsense/plugins/issues/new?assignees=&labels=&template=feature_request.md&title=
It's difficult to track and discuss multiple required code changes in a forum post...
Cheers,
Franco
-
The error appears on startup only and is unrelated to whether Unbound is enabled.
For me, Telegraf works just fine. (I don't use the Unbound input, which is not related to the error.)
-
The error appears on startup only and is unrelated to whether Unbound is enabled.
For me, Telegraf works just fine. (I don't use the Unbound input, which is not related to the error.)
Would it be too much to ask you to add the Unbound input just to test it? I know you are one of the gurus here, so maybe you'll see something that I didn't...
-
I'm working on a patch ..
-
https://github.com/opnsense/plugins/pull/2488
-
https://github.com/opnsense/plugins/pull/2488
How should I download and test it? Dumb question, I know. I tried to update the plugins but it didn't show up.
-
It'll take one or two releases to go to stable.
-
It'll take one or two releases to go to stable.
Hi Mimugmail, it got into this release.
Nevertheless, if I enable Unbound it generates the same error as before:
E! [inputs.unbound] Error in plugin: error gathering metrics: error running unbound-control: exit status 1 (/usr/local/sbin/unbound-control [-c /var/unbound/unbound.conf stats_noreset])
My troubleshooting steps so far: I removed the plugin. I looked around and didn't find any Telegraf files once the plugin was removed (I used find / -name telegraf), and I deleted anything that was left. I also deleted the telegraf user and group.
I reinstalled the plugin. All settings from before appeared checked, so they are probably stored in a different place, but I'm assuming that doesn't mean much.
As soon as I start Telegraf it generates the error above. I tried with and without the wheel group set.
Any other ideas?
-
Did you also enable wheel group?
-
Did you also enable wheel group?
Yes, I did. Both on and off.
With it on, I SSHed to the firewall and checked whether the telegraf user had wheel as one of its groups. It didn't. I tried to add it, but that didn't solve the problem.
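The group membership check described above can be sketched as a small shell helper. This is only an illustration: the function names in_group and report are made up for this example, and the group list is passed in as a plain string so the helper can be tried without touching the system.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the OPNsense plugin).

# in_group GROUP LIST: succeed when GROUP appears in the space-separated LIST.
in_group() {
    for g in $2; do
        if [ "$g" = "$1" ]; then
            return 0
        fi
    done
    return 1
}

# report USER GROUP LIST: print a one-line verdict for USER.
report() {
    if in_group "$2" "$3"; then
        echo "$1 is in $2"
    else
        echo "$1 is NOT in $2"
    fi
}
```

On the firewall this could be called as report telegraf unbound "$(id -Gn telegraf)" and again with wheel. A daemon that is already running keeps the groups it started with, so Telegraf would likely need a restart after any membership change.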
-
For me it also works only some of the time, maybe after restarting Unbound or OPNsense itself.
-
For me it also works only some of the time, maybe after restarting Unbound or OPNsense itself.
I just tried a reboot on both my systems (primary and backup); same error.
One thing that is bothering me is that the logs show the time of the error 3 hours ahead of mine, which is UTC. My system is set to America/Sao_Paulo. Does that help with troubleshooting?
Is it possible to roll back to a version of the Telegraf plugin before 1.19?
-
Below is from a thread on the Netgate forum that I used to make it work in the past, until 1.19 came along. It might also give a hint:
uhm i have the same error on my 2.5.0 now.. but idk .. anyway
i was reading this
https://github.com/influxdata/telegraf/tree/master/plugins/inputs/unbound
## The default location of the unbound config file can be overridden with:
# config_file = "/etc/unbound/unbound.conf"
But this does not work, because config_file is not defined inside the pfSense plugin.
binary = "/usr/local/sbin/unbound-control"
This works, but if you add -c /var/unbound/unbound.conf it is unable to execute the command.
The only workaround I found is to create a symlink inside the unbound-control default directory...
-c file config file, default is /usr/local/etc/unbound/unbound.conf
The workaround is as follows:
rm /usr/local/etc/unbound/unbound.conf
ln -s /var/unbound/unbound.conf /usr/local/etc/unbound/unbound.conf
Inside Telegraf on pfSense, under "Additional configuration for Telegraf":
[[inputs.unbound]]
binary = "/usr/local/sbin/unbound-control"
thread_as_tag = false
to test the plugin:
/usr/local/bin/telegraf -config=/usr/local/etc/telegraf.conf --test --input-filter unbound
2020-04-03T20:57:16Z I! Starting Telegraf 1.13.4
> unbound,host=pfSense.kiokoman.home infra_cache_count=0,key_cache_count=0,mem_cache_message=66072,mem_cache_rrset=66072,mem_mod_iterator=1bla bla bla
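The rm and ln -s steps quoted above can be made repeatable with a small wrapper. The following is a minimal sketch, not something from either forum thread: ensure_link is a hypothetical helper name, and unlike the quoted workaround it moves an existing file aside to a .bak copy instead of deleting it.

```shell
#!/bin/sh
# Hypothetical helper: ensure_link TARGET LINK
# Makes LINK a symlink to TARGET. Does nothing if that is already the case;
# otherwise moves any existing LINK to LINK.bak first (assumption: a backup
# is preferred over the plain rm used in the quoted workaround).
ensure_link() {
    el_target=$1
    el_link=$2
    if [ -L "$el_link" ] && [ "$(readlink "$el_link")" = "$el_target" ]; then
        return 0
    fi
    if [ -e "$el_link" ] || [ -L "$el_link" ]; then
        mv "$el_link" "$el_link.bak" || return 1
    fi
    ln -s "$el_target" "$el_link"
}
```

With the paths from the thread, the call would be ensure_link /var/unbound/unbound.conf /usr/local/etc/unbound/unbound.conf, run as root. Running it a second time leaves the existing link untouched.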
-
For me it also works only some of the time, maybe after restarting Unbound or OPNsense itself.
So, I downloaded version 1.19.2 of Telegraf. Still no luck. I also added the telegraf user to the admins group; it didn't change the results.
I'm still trying.
-
I checked back and I have metrics for the last 2 days. I have now also added userprocess "root" for Telegraf; maybe this helps. I believe that with your multiple attempts you ended up in a state where nothing works?
-
Hi,
I think I was able to create a workaround. Not the safest one...
In /usr/local/etc/rc.d/telegraf, instead of running the daemon as the telegraf user, I changed it to run as root.
I don't have any errors now and the data is being saved to Influx as I wanted.
It's still not clear why the telegraf user can't run Telegraf in daemon mode, but it's certainly related to that.
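Since the failure disappears when the daemon runs as root, one possible cause is that the telegraf user cannot read something unbound-control needs. Nobody in the thread confirmed this. A minimal sketch for checking it, where check_readable is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: check_readable FILE...
# Prints "ok FILE" or "DENIED FILE" for each argument, judged for the user
# running the script. Returns non-zero if any file is missing or unreadable.
check_readable() {
    cr_rc=0
    for f in "$@"; do
        if [ -r "$f" ]; then
            echo "ok $f"
        else
            echo "DENIED $f"
            cr_rc=1
        fi
    done
    return $cr_rc
}
```

To be meaningful it has to run as the telegraf user rather than root, for example from a script started with su -m telegraf, and be given /var/unbound/unbound.conf plus whatever control key and certificate files that config references. Which files those are on a given install is an assumption to verify, not something stated in this thread.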
-
Yep, once merged this will be the default when the wheel group checkbox is ticked.