Upgrade 17.7.12_1>18.1 failure

Started by Taomyn, January 29, 2018, 09:48:45 PM

Unfortunately I have to report that my upgrade from 17.7.12_1 to v18.1 failed.


Everything appeared to go smoothly but I ended up with a firewall I could not leave running. Problems I had were:

       
  • Firewall service error preventing firewall starting up, had to disable this rule:
    There were error(s) loading the rules: /tmp/rules.debug:143: proto icmp doesn't match address family inet6 - The line in question reads [143]: pass in quick on pppoe0 reply-to ( pppoe0 fe80::eab7:48ff:fed9:a00 ) inet6 proto icmp from {any} to {any} keep state label "USER_RULE: Allow IPv6 Ping" # 560108416307135435e3168fe05a398b
  • Firewall dynamic rule displaying blocks for rules that make no sense:
But the one that finished me off:

       
  • My mail server was unable to accept connections properly - Internet access was fine, access to my websites appeared ok, but my mail server was basically dead to the outside.
Things I tried:

       
  • Restoring the last backup I did before the upgrade
  • Factory defaulting then restoring the last backup
  • Fresh install and restoring the last backup
None of these made any difference. So after 4 hours I re-installed a fresh copy of 17.7.5, imported my backup, upgraded back to 17.7.12_1 and installed the missing plug-ins. Literally the moment the firewall came back online after rebooting from the restore, my mail server started receiving emails again.

Can you post a Screenshot of rules and port forwards?

Sure, see attached. Excuse the redactions, I don't like to show too much on a public forum.

I meant to add, my WAN is also a PPPoE connection, which usually tends to throw up bugs in new versions of the firewall.

January 29, 2018, 10:42:21 PM #4 Last Edit: January 29, 2018, 10:45:12 PM by frank_p
First of all, thanks a lot for all the effort you put into releasing v18. That's really great.

I am using port forwarding (NAT rules) to forward SSL traffic from a DMZ-based mail proxy or SSL proxy to other servers in the LAN.

Since updating from 17 to 18, forwarding of incoming HTTPS traffic (443) from the DMZ to the LAN has not been working.

1.) Before I changed the web GUI listen interface in the admin settings from all (default) to LAN, every SSL request was answered with the web GUI certificate (which was the wrong one :))

2.) I changed the web GUI listen interface to LAN to ensure access from the internal LAN. External forwarding to my mail proxy or SSL proxy is now no longer answered with the (wrong) web GUI certificate of OPNsense, BUT the mail proxy and SSL proxy respond with "ERR_SSL_PROTOCOL_ERROR". This means all firewall rules and NAT rules are working, but the "ERR_SSL_PROTOCOL_ERROR" arises somewhere (I don't know where) in the communication between the firewall and the DMZ-based proxies.

@Taomyn:

This cost me an hour of debugging before I figured it out.

If using rules for ICMP:

- do NOT mix v6 and v4 rules (don't use "IPv4+IPv6" version constraint)
- use protocol "IPV6-ICMP" for any rule regarding ICMP and v6
- use protocol "ICMP" for any rule regarding ICMP and v4
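
The three rules above can be expressed as a small check. This is only a sketch in Python, not pfctl or OPNsense code: the function name is hypothetical, the first message is copied from the error quoted in this thread, and the second message is an assumption modelled on it.

```python
# Hypothetical helper for illustration; not taken from pfctl or OPNsense.
def icmp_family_error(af, proto):
    """Return an error string if an ICMP protocol does not fit the
    address family ("inet" or "inet6"), otherwise None."""
    af = af.lower()
    proto = proto.lower()
    # Plain ICMP belongs to IPv4 only.
    if proto == "icmp" and af == "inet6":
        return "proto icmp doesn't match address family inet6"
    # ICMPv6 belongs to IPv6 only (message text here is an assumption).
    if proto in ("ipv6-icmp", "icmp6") and af == "inet":
        return "proto icmp6 doesn't match address family inet"
    return None


if __name__ == "__main__":
    # The combination from the failing "Allow IPv6 Ping" rule.
    print(icmp_family_error("inet6", "icmp"))
    # The corrected combination.
    print(icmp_family_error("inet6", "IPV6-ICMP"))
```

A rule set to "IPv4+IPv6" would need one such check per address family, which is why a single ICMP protocol choice cannot satisfy both.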

Quote from: nasq on January 29, 2018, 11:17:02 PM

If using rules for ICMP:



None of my ICMP rules are 4+6, and the GUI doesn't allow it anyway.

It does actually. If you create a new rule it selects IPv4 automatically and that is ok. If you are unsure try saving it even though it looks ok. Underneath it really tries to apply IPv4+IPv6, the pf.conf syntax error is pretty telling and consistent with what others have seen.


Cheers,
Franco

January 29, 2018, 11:30:47 PM #8 Last Edit: January 29, 2018, 11:33:06 PM by nasq
The error message you quoted seems to indicate that there is a rule involving ICMP.
Quote
There were error(s) loading the rules: /tmp/rules.debug:143: proto icmp doesn't match address family inet6 - The line in question reads [143]: pass in quick on pppoe0 reply-to ( pppoe0 fe80::eab7:48ff:fed9:a00 ) inet6 proto icmp from {any} to {any} keep state label "USER_RULE: Allow IPv6 Ping" # 560108416307135435e3168fe05a398b

> None of my ICMP rules are 4+6
The error also occurs when you have a v6 only rule which uses the protocol "ICMP" instead of "IPV6-ICMP"
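
To find such rules before upgrading, one could scan the generated ruleset text for lines that combine inet6 with plain icmp. The following is a rough Python sketch under stated assumptions: the function name and regular expression are my own, it only handles the simple "inet6 proto icmp" form seen in the quoted error (not protocol lists in braces), and the first sample line with em0 is made up for contrast.

```python
import re

# Matches "inet6 proto icmp" but not "inet6 proto icmp6" or "proto ipv6-icmp".
V6_WITH_ICMP4 = re.compile(r"\binet6\s+proto\s+icmp\b")


def find_mismatched_rules(text):
    """Return (line_number, line) pairs for rules that declare inet6
    but use the IPv4-only protocol keyword icmp."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if V6_WITH_ICMP4.search(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    sample = "\n".join([
        # Made-up IPv4 rule, shown only for contrast.
        'pass in quick on em0 inet proto icmp from {any} to {any} keep state',
        # Shortened form of the rule quoted in this thread.
        'pass in quick on pppoe0 inet6 proto icmp from {any} to {any} keep state label "USER_RULE: Allow IPv6 Ping"',
    ])
    for number, line in find_mismatched_rules(sample):
        print(number, line)
```

On the firewall itself the input would be the contents of /tmp/rules.debug, the file named in the error message.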

I take that back, it's our code, since that check has been in FreeBSD since at least 10.0:

https://github.com/opnsense/src/blob/master/sbin/pfctl/parse.y#L4634-L4640


Cheers,
Franco

I had the same problem with IPv6-ICMP during RC testing. It's been noted already that it's a PITA for those who are not aware of it. Either it needs sorting alphabetically so it sits next to ICMP, or it should be auto-detected depending on whether the rule is v4 or v6.

The message I posted in the RC threads does not seem to be around now, but those that hit this problem have my sympathy!

Quote from: franco on January 29, 2018, 11:30:38 PM
It does actually. If you create a new rule it selects IPv4 automatically and that is ok. If you are unsure try saving it even though it looks ok. Underneath it really tries to apply IPv4+IPv6, the pf.conf syntax error is pretty telling and consistent with what others have seen.


Cheers,
Franco


Quote from: nasq on January 29, 2018, 11:30:47 PM
The error message you quoted seems to indicate that there is a rule involving ICMP.
Quote
There were error(s) loading the rules: /tmp/rules.debug:143: proto icmp doesn't match address family inet6 - The line in question reads [143]: pass in quick on pppoe0 reply-to ( pppoe0 fe80::eab7:48ff:fed9:a00 ) inet6 proto icmp from {any} to {any} keep state label "USER_RULE: Allow IPv6 Ping" # 560108416307135435e3168fe05a398b

> None of my ICMP rules are 4+6
The error also occurs when you have a v6 only rule which uses the protocol "ICMP" instead of "IPV6-ICMP"


OK, so that's a bug in v17 which v18 flags when it parses the rules. I've amended the rule so it should at least be correct in future. I suspect the IPv6 rule was a clone of the IPv4 rule and the protocol never got changed, as I didn't actually know there was an IPv6 version of ICMP.

In the end, as I originally wrote, this wasn't the deal breaker: I disabled and then deleted the broken rule to get the firewall to finally start up. After that the real problem started, i.e. my mail server not receiving any connections.

Found it. A refactor broke the logic. I'd never have expected pfctl to be so picky about it, though: it's a seemingly impossible combination, but so is port 0, and pf accepts that so integrity can be enforced using a block.

https://github.com/opnsense/core/commit/a591cf141


Cheers,
Franco

So your code change will set the correct protocol ICMP or IPV6-ICMP according to the rule's selected inet protocol.
This will avoid misconfiguring. Great.

It did that before, and did so on 17.7, but the code went through a refactor for the NAT rules so that code moved and $ipproto was not defined anymore.
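
For readers following along, the behaviour discussed here (picking ICMP or IPV6-ICMP from the rule's address family) could look roughly like this. It is a Python sketch for illustration only, not the PHP code from the commit linked above; the function name and the "inet"/"inet6" and "icmp"/"ipv6-icmp" strings are assumptions.

```python
# Illustrative sketch; not the OPNsense implementation.
def icmp_proto_for(ipprotocol, protocol):
    """Return the protocol keyword to emit for a rule.

    ipprotocol: "inet" (IPv4) or "inet6" (IPv6).
    protocol:   the protocol chosen in the rule, e.g. "icmp" or "tcp".
    """
    if protocol.lower() not in ("icmp", "ipv6-icmp"):
        # Non-ICMP protocols pass through unchanged.
        return protocol
    # For ICMP, the address family decides which variant is valid.
    return "ipv6-icmp" if ipprotocol == "inet6" else "icmp"


if __name__ == "__main__":
    # A cloned IPv4 ping rule switched to IPv6 gets the matching protocol.
    print(icmp_proto_for("inet6", "icmp"))
    print(icmp_proto_for("inet", "icmp"))
```

If the address family is missing at this point, as happened with the undefined $ipproto after the refactor, a mapping like this has nothing to base its choice on and the stored protocol is emitted as-is.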