
Messages - wstemb

#1
Resolved.

It was the trailing whitespace in the myfile.prom file that caused the error.

I killed the node_exporter process and restarted it from the CLI. After the startup lines, I got this error:

Quote
ts=2025-02-06T11:14:06.216Z caller=textfile.go:245 level=error collector=textfile msg="failed to collect textfile data" file=myfile.prom err="failed to parse textfile data from \"/var/tmp/node_exporter/myfile.prom\": text format parsing error in line 3: expected integer as timestamp, got \"\""

Searching the web, I found https://github.com/prometheus/common/issues/33. In short: the node_exporter textfile parser does not tolerate trailing whitespace.
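
For anyone who hits the same problem, this is roughly how the file can be generated safely, as a minimal Python sketch (the metric name and value are only an illustration, not my real script): strip trailing whitespace from every line, end with a newline, and write through a temporary file so the collector never reads a half-written file.

Code:
#!/usr/local/bin/python3
# Minimal sketch of writing a textfile-collector .prom file; the metric
# name and value are illustrative only.
import os
import tempfile

PROM_DIR = "/var/tmp/node_exporter"   # must match --collector.textfile.directory
PROM_FILE = os.path.join(PROM_DIR, "myfile.prom")

def write_metrics(lines):
    # Strip trailing whitespace from every line: the textfile parser
    # rejects it, as in the error above.
    body = "\n".join(line.rstrip() for line in lines) + "\n"
    # Write to a temporary file and rename it, so node_exporter never
    # sees a partially written file.
    fd, tmp = tempfile.mkstemp(dir=PROM_DIR, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(body)
    os.replace(tmp, PROM_FILE)

write_metrics([
    "# HELP my_example_metric Example value produced by a local script.",
    "# TYPE my_example_metric gauge",
    "my_example_metric 42",
])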
#2
I installed the os_node_exporter plugin, and it is working: it is serving data from OPNsense to Prometheus and Grafana.

The problem (or my lack of knowledge) is that, although the flag "--collector.textfile.directory=/var/tmp/node_exporter" is there (as seen from "ps aux | grep node"), the plugin is not reading or including the content of *.prom files placed in this directory:

Quote
/var/tmp/node_exporter # ls -al
total 12
drwxr-xr-t  2 nobody nobody 512 Feb  5 13:21 .
drwxrwxrwt  6 root   wheel  512 Feb  5 13:05 ..
-rw-r--r--  1 nobody nobody 521 Feb  5 12:39 myfile.prom

When I look at http://fw_IP_ADDR:9100/metrics, the rows from myfile.prom are not there; instead I find:

Quote
# HELP node_scrape_collector_duration_seconds node_exporter: Duration of a collector scrape.
# TYPE node_scrape_collector_duration_seconds gauge
...
node_scrape_collector_duration_seconds{collector="textfile"} 0.000197219
...
# HELP node_scrape_collector_success node_exporter: Whether a collector succeeded.
# TYPE node_scrape_collector_success gauge
...
node_scrape_collector_success{collector="textfile"} 1
...
# HELP node_textfile_scrape_error 1 if there was an error opening or reading a file, 0 otherwise
# TYPE node_textfile_scrape_error gauge
node_textfile_scrape_error 1

The behaviour is the same on a 24.7 and a 25.1 firewall.

So there must be some read error.
As seen in the quote above, the file is inside the /var/tmp/node_exporter directory, it has the .prom extension, and I changed the owner to nobody:nobody since the process runs under this user. The documentation on the plugin's GitHub is weak about this; does somebody have some advice?
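
One thing that may help with debugging this: if the Python prometheus_client package is available somewhere, the file can be test-parsed before node_exporter touches it. A rough sketch follows; the Python parser is not identical to the Go parser inside node_exporter, so it will not catch everything (the trailing-whitespace case above, for example), but it does catch gross format errors.

Code:
#!/usr/local/bin/python3
# Rough sketch: try to parse a textfile-collector .prom file with the
# prometheus_client package (pip install prometheus_client). Only a
# first sanity check, not equivalent to node_exporter's own parser.
from prometheus_client.parser import text_string_to_metric_families

PATH = "/var/tmp/node_exporter/myfile.prom"

with open(PATH) as handle:
    text = handle.read()

try:
    for family in text_string_to_metric_families(text):
        for sample in family.samples:
            print(sample.name, sample.labels, sample.value)
    print("parsed OK")
except ValueError as exc:
    print("parse error:", exc)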
#3
As proposed in the support documentation linked in the previous post, use the "Have Feedback" link at the bottom left of the Zenarmor web GUI. You will have the opportunity to check the box to attach logs (and to see their location in the filesystem).

If you want to see them yourself, you must go to the CLI (/usr/local/zenarmor/log).
#4
DISCLAIMER: what I write in the next rows is not a solution; it is a brute-force workaround for just one fixed scenario (SMTP server, No Security), if you desperately need the mail report.

OPNsense 23.7.12_5
Zenarmor 1.16.2

Edit send.py and comment out the two lines around line 246:

#       if password:
#           smtp.login(username, password)

to avoid bug 2 from the previous post.

Then you MUST choose (if/when possible and applicable):
Mail provider: SMTP Server
Mail server hostname: Hostname  or IP of a server without authentication
Connection Security: No Security

The mail server port will change to port 25.

You have to put some dummy data in the username and password fields to avoid bug 1 from the previous post.
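
For context, in this scenario (plain SMTP, port 25, No Security) sending the mail needs no login at all. A minimal Python sketch of what it boils down to (hostname and mail addresses are placeholders, this is not the Zenarmor script):

Code:
#!/usr/local/bin/python3
# Minimal sketch of what "SMTP Server, port 25, No Security" boils down
# to: connect and send, no smtp.login() call at all.
import smtplib
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Zenarmor report (test)"
msg["From"] = "report@example.org"
msg["To"] = "admin@example.org"
msg.set_content("Test body")

with smtplib.SMTP("mail.example.org", 25) as smtp:
    smtp.send_message(msg)   # no login(), no starttls()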
#5
I sent a feedback/bug report following the instructions: https://www.zenarmor.com/docs/support/reporting-bug
#6
Yes, the maintenance and upgrade process has some rules.

I changed the script a little (two lines) to make it work just for me in one strict scenario (No Security). It is a brute-force approach; my copy will probably be overwritten soon by a Zenarmor upgrade.
#7
I found the same error using the local mail server and investigated it a little:

The script cannot be run just as shown in the post; the command must receive arguments from the caller, and your error is caused by the lack of arguments.

usage: send.py [-h] [-b PDF] [-S SERVER] [-R PROVIDER] [-P PORT] [-s SECURED] [-u USERNAME] [-p PASSWORD] [-f SENDER] [-t TO] [-v NOSSLVERIFY]

Bug no. 1: When you use it with plain SMTP and No Security (without username and password), the switches --username and --password are still in the command, but without arguments:

I added echo $@ > filename to the script; the captured arguments were:

--provider smtp-server --pdf false --server a.b.c.d --port 25 --secured true --username --password --sender i@am.here --to you@are.there --nosslverify false

producing a send.py error message:   

send.py: error: argument -u/--username: expected one argument
or, if you put a username and an empty password:
send.py: error: argument -p/--password: expected one argument

and the error in the GUI: Error (200) There was an issue on our end. Sorry about that.


Bug no. 2:

If you stay with plain SMTP on port 25 and No Security, and put in some username/password data to bypass bug no. 1, the script send.sh passes valid data to send.py, which wrongly answers with:

{"successful": false, "message": "Smtp :No suitable authentication method found."}

Which authentication methods? I am using plain SMTP on port 25 with No Security!

What works?


When, in the CLI, I run the script send.sh with all switches except --username and --password, the result of the script (send.py called by send.sh) is:

{"successful": true, "message": "Mail has been send successfully!"}

and the test mail is sent and received.

Conclusion:
the script send.py has to be rewritten with better argument parsing:
a) permitting an empty --username and --password;
b) dropping them as parameters if the connection security is "No Security". Right now the script wrongly assumes that a password present in the arguments means it must log in; plain SMTP with No Security does not need a login.

I needed the report to work, so I hard-coded the script send.py a little to make it work in the simplest scenario (No Security), but I still have to check with STARTTLS.
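
To make a) and b) concrete, here is a sketch of the parsing and login logic I mean; it is only an illustration of the idea, not the real send.py (the switch names follow the usage line above, everything else is made up):

Code:
# Sketch of the argument parsing / login logic -- illustration only.
import argparse
import smtplib

parser = argparse.ArgumentParser(prog="send.py")
parser.add_argument("-S", "--server", required=True)
parser.add_argument("-P", "--port", type=int, default=25)
parser.add_argument("-s", "--secured", default="false")
# a) tolerate the switch being present with no value (the GUI passes it empty)
parser.add_argument("-u", "--username", nargs="?", const="", default="")
parser.add_argument("-p", "--password", nargs="?", const="", default="")
args = parser.parse_args()

smtp = smtplib.SMTP(args.server, args.port)
if args.secured.lower() == "true":
    smtp.starttls()
# b) only log in when real credentials are given; plain SMTP with
#    "No Security" must not call smtp.login() at all
if args.username and args.password:
    smtp.login(args.username, args.password)
# ... build and send the message here ...
smtp.quit()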

Where is the PDF checkbox in the GUI?
#8
Zenarmor (Sensei) / Re: Local vs Remote confusion
October 18, 2023, 09:11:28 AM
https://forum.opnsense.org/index.php?topic=33270.msg160924#msg160924

The graphs were swapped, but the drill-down filters worked only if manually changed to the right (expected, but not offered) one.

Since the issue was visible only in passive mode and disappeared when using routed mode, I have not worked on it in the last months.

I have to check whether it is still present in the newer versions/releases of Zenarmor.
#9
Quote from: tessus on June 14, 2023, 05:00:52 AM

I have no way to assign the interface IDs myself, since they are chosen automatically when creating an interface. There is no way to do that manually.

It is possible: a boring process of manually defining the interfaces on the second firewall one by one, following the order OPT1 -> OPT23. But it can be done in less than an hour with this number of interfaces.
Quote

Unless a restore keeps the same assignments, this is impossible. Otherwise a backup and restore should do the trick.

You can try it: back up the main, edit the XML (IP addresses and so on), and restore it on the (new) backup node.
Quote

Here lies the issue. I have N (about 25) VLANs. This means I have to change 2xN interfaces and create N CARP VIP entries.

You have to define a new IP address on every interface of the backup node (something that has to be done in any case), replace an IP address on every interface of the main, and define new CARP VIPs on both nodes with the previously used addresses. So four actions per interface. Boring manual work again, but it can be done relatively fast. As long as the second firewall stays disabled, it can be done sequentially, in phases.

But maybe it can be done, at least partially, by editing the backup config XML file and restoring it on the main, and by combining part of the interface config from the main into the backup node's config XML and restoring that on the backup (see the sketch below). I preferred the manual work, where everything was under control.
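
If somebody wants to try the XML route, the idea would be something like the following rough sketch; it assumes the backup is the usual config.xml with an <ipaddr> element under each interface node, and the address map is invented:

Code:
# Rough sketch: remap interface addresses in an OPNsense config backup.
# Assumes the backup is the usual config.xml with an <ipaddr> element
# under each interface node; the address map below is invented.
import xml.etree.ElementTree as ET

ADDRESS_MAP = {
    "192.168.1.1": "192.168.1.3",   # old firewall address -> new node address
    "10.0.10.1": "10.0.10.3",
}

tree = ET.parse("config-main.xml")
for ipaddr in tree.iter("ipaddr"):
    if ipaddr.text in ADDRESS_MAP:
        ipaddr.text = ADDRESS_MAP[ipaddr.text]
tree.write("config-node2.xml")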
Quote

Then I have to change all firewall rules, because the FW now has to use the virtual interfaces, which are using new interface IDs.

No, I did not touch any rules after building the HA, neither on the main nor in the synchronized rules on the backup. Everything keeps working if you keep the cluster IP addresses equal to the former firewall addresses. I only had to change the OpenVPN server interface from WAN to the CARP VIP address of the WAN.
Quote

I also use OpenVPN (out) and Wireguard (in/out). I certainly would have to figure out how to make this work as well.

For OpenVPN in client access mode I will tell you later: I have defined everything and the services are working, but I still have to check whether failover works (in some maintenance window).
Quote
Quote from: wstemb on June 13, 2023, 09:13:51 AM
3. Defining the High Availability on main and second node, and defining all the synchronization (XMLRPC Sync) you need. This will copy the chosen definitions  to the second node.

Yes, this should not be too complicated.

Thanks for the link, but I actually had read that one before I posted this topic.

Unfortunately all this is a moot conversation unless there is an answer to my first question.
I can't be the only one who has a cable modem, can I? Additionally, anyone who uses OPNsense is most likely using the modem in bridged mode, so someone should have an answer to my question.

I have the simplest routing scenario on WAN, with a standalone managed switch connecting the cluster nodes and the ISP router, using a small IP segment on the WAN side and fixed IPs for all nodes on this segment.

Can you describe in more detail how the WAN is configured in bridge mode? I have no experience with cable modems; several years ago I had to use an ISP ADSL bridge/router configured as a bridge, and I moved to a routed setup ASAP...

#10
It can be done, and I did it (IPv4 only); it was a smooth, straightforward few hours of manual work.

I have 6 real interfaces (including PFSYNC) and 13 VLANs on some of the real interfaces, on every node. The firewall (which would become the master) had been in production for a few months. I had to work "in place", since I was missing a third machine.

First, I made an IP address plan: 3 addresses per interface. The addresses on the "old", existing firewall have to become the VIP addresses; the other two are for the nodes.

I manually reconstructed the interfaces on the new firewall (an identical machine to the MASTER), first the real ones and after that the VLANs, just following the order of the OPTxx interfaces. Where I had a gap in the numbering of the OPT interfaces (just one, luckily) I defined a "placeholder", defined the next one, and after that deleted the placeholder. A few hours of non-intrusive work that can be done whenever you want. I also had to define the Virtual IP entries of type OTHER manually on the new firewall, since during the test I did not see them being copied (there were just a few of them, so it was easier to define them than to chase the issue).

After that, I defined the VIPs one by one, changing the master's IPv4 address to one reserved for the node and moving the old address to the VIP. Then I synchronized the backup with the master. I had no need to change any rule or NAT definition, just the OpenVPN server interface address.

All the work was done in two evenings, in the maintenance window: the first day the backup switch trunk and VLAN definitions, IP address planning, testing, basic functionality and the main interfaces; the second, all the rest. In the meantime, the backup node was disabled. All the time, at every step, I made backups of the configurations of both firewalls, so I could step back if needed.

I am now working on the two last functions: OpenVPN client access (using an internal CA :-( ) and FRR.

Probably there is a better way, but I had to get the work done and I had deadlines. So I did it manually this way, knowing that "the better is the enemy of the good".
#11
I cannot answer your first question.

About adding another node to an already highly configured firewall without redefining everything from scratch, the first part (network and firewall topology) is possible:
1. You have to build another node with an exact copy of the interfaces of the first one (exact means the exact OPTx assignments, since the OPTx definitions are used during the synchronization phase, when the rules are copied to the second node).
2. Define a new set of IP addresses on every pair of interfaces, and define CARP VIPs on all interfaces using the IP addresses previously used on the single firewall's interfaces (so you do not have to change the default gateways on the network nodes).
3. Define High Availability on the main and the second node, and define all the synchronization (XMLRPC Sync) you need. This will copy the chosen definitions to the second node.

The guide https://www.thomas-krenn.com/en/wiki/OPNsense_HA_Cluster_configuration  is enough for this phase, if you extrapolate it to a more complex situation and if you maintain the OPTx order of interfaces.

I am now working on porting OpenVPN to the cluster, so I cannot add anything about that yet.
#12
Work half done.

Installed a second firewall on identical hardware and upgraded it to the same firmware version.

Defined all the interfaces (I have a lot of them, most of them VLANs). I had to follow strictly the same order of OPTx names during the definitions on the second firewall; if not, the HA "Synchronize states" will copy definitions onto the wrong interfaces.

Defined corresponding CARP VIPs on both firewalls for all defined interfaces.

On the first tests it seems everything (that has been defined) is working, but since the work is not finished and important functions still have to be redefined (the most important being the OpenVPN servers and the OSPF definition), I have disabled the second firewall for now, so the cluster is working on one node only.

I had to change the OpenVPN server interface to the cluster one on WAN.
#13
Thank you.

I am now using only IPv4, and plan to continue using it alone on the cluster, so there will be no issues with IPv6.

On the cluster of commercial firewalls I had before, I did not use DHCP in cluster mode. Both firewalls had DHCP enabled on selected interfaces, with similar options and different scopes. The plan was to continue this way.

Zenarmor is not a showstopper; it can be temporarily disabled/uninstalled during reconfiguration if necessary. It is very important that OpenVPN keeps working for remote users, and OSPF too, so new questions could arise there.
#14
I did not find answers on this topic, only questions.

Is it possible to build a cluster on top of an already highly configured and working firewall?

Some interruptions are acceptable, but I do not have the window for the "dismantle and rebuild" work, at least not without a very precise plan and timetable, since Murphy's Law governs our work.

The first firewall machine is configured and working with native or VLAN interfaces and rules, DHCP working on some of them, OpenVPN, Zenarmor, routing (OSPF), and some other functions and plugins that I can disable.

The second firewall machine is identical in hardware and in basic OPNsense post-install settings to the first, working machine.

I have at least one free NIC and enough free IP addresses on all interfaces, including WAN.

The OPNsense manuals I found describe building the cluster from scratch. I cannot afford that, because I do not have a third identical machine on which to build a cluster and then reconfigure it following the existing production firewall configuration.

Thanks, Walter

#15
Quote from: Rootfix on April 04, 2023, 04:17:20 PM
Hi,

In passive mode, Zenarmor uses the pcap instead of the netmap. It provides to get  a copy of the packets. So Zenarmor classify the traffic according to source and destination address. It can not know which one is local side and remote site. With the upcoming releases a configuration option will be available in passive mode to indicate the LAN and WAN interfaces, So, Zenarmor could be able to determine for local and remote IP addresses.

...


But I have the feeling there is no problem in the [Source] or [Destination] categorization of the addresses, just in the graph displaying them. If I drill down on the Local Hosts or Remote Hosts graph, it just uses the wrong filter (the one corresponding to the graph) and displays nothing.

If I correct it by applying the opposite filter, e.g. a [Source Hostname] filter for real internal (local) addresses that are wrongly displayed in the "Remote Hosts" graph, the filter displays this local (source) machine's graphs correctly.

To be clear, when in routed mode, Zenarmor is protecting the LAN interface only (igb0). No access is allowed from the outside (WAN) to hosts on the LAN, only from some other isolated segments (DMZ-like).

Walter