Messages - hemirunner426

#1
I seem to have a similar issue with gateways.  I'm not sure how to trace this one down.

After upgrading to 23.7.8, everything looked fine in the web UI.  All services started, but machines connected to the router had sporadic internet access.  I noticed DHCPv6 seemed to restart a couple of times and then stay running, but machines behind the gateway were having a hard time connecting to sites, i.e., some sites worked and some didn't.  The firewall logs looked fine, and packets didn't appear to be dropped.

The only real clue was DHCP restarting.  I took a look at the logs and saw the same log entries as mentioned here.


2023-11-09T14:35:40-07:00 Warning opnsense /usr/local/etc/rc.bootup: The required WAN_6RD IPv6 interface address could not be found, skipping.
2023-11-09T14:35:40-07:00 Warning opnsense /usr/local/etc/rc.bootup: Skipping gateway WAN_6RD due to empty 'gateway' property.
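
To pick the affected gateway names out of a longer log, a small filter along these lines could help. This is only a sketch: the function name `skipped_gateways` is made up for illustration, and the regular expression assumes the exact message wording of the two lines quoted above.

```python
import re

# Pattern assumes the wording of the rc.bootup warning quoted above.
SKIP_RE = re.compile(r"Skipping gateway (\S+) due to empty 'gateway' property")


def skipped_gateways(log_text):
    """Return skipped gateway names in order of first appearance (hypothetical helper)."""
    names = []
    for line in log_text.splitlines():
        match = SKIP_RE.search(line)
        if match and match.group(1) not in names:
            names.append(match.group(1))
    return names


# The two log lines from the post, used as sample input.
sample = (
    "2023-11-09T14:35:40-07:00 Warning opnsense /usr/local/etc/rc.bootup: "
    "The required WAN_6RD IPv6 interface address could not be found, skipping.\n"
    "2023-11-09T14:35:40-07:00 Warning opnsense /usr/local/etc/rc.bootup: "
    "Skipping gateway WAN_6RD due to empty 'gateway' property.\n"
)
print(skipped_gateways(sample))
```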


I applied the patch in this thread and restarted, but the behavior was the same.  I then went into the gateway settings and reapplied the same settings, and machines behind the router started working properly again on both IPv4 and IPv6.

I can reproduce the problem by restarting the router.  I can go back into gateways and reapply the settings to fix it.

Is there any other information I can gather and provide here?



#2
The 'Errors Out' numbers in the dashboard match the output errors that the ifinfo command reports for the respective interface.
#3
21.7 Legacy Series / Re: 6rd Gateway monitoring
January 12, 2022, 02:17:21 PM
The patch took care of the issue.

Thanks!
#4
21.7 Legacy Series / Re: 6rd Gateway monitoring
January 11, 2022, 06:40:53 PM
1. Either ping doesn't work for any address or just the gateway IP, confirm by using 2001:4860:4860::8888 manually as monitor address.

It appears it's only '2602::205.171.2.64' that will not respond to ping.  '2001:4860:4860::8888' will respond on the command line but fails as a monitor address.  The log produces the following:

2022-01-11T10:33:10 opnsense[7295] /system_gateways.php: The WAN_6RD IPv6 gateway address is invalid, skipping.

2.  I see the echo request but a reply never comes through.
0:36:35.525335 IP 184.4.49.221 > 0.0.0.0: IP6 2602:b8:431:dd00:: > 2602::cdab:206: ICMP6, echo request, seq 10, length 16

3. I don't see IPv6 ICMP being blocked here.  Nothing sticks out to my eye.

4. #netstat -nr
Internet6:
Destination                       Gateway                       Flags     Netif Expire
default                           2602::cdab:240                UGS     wan_stf
::1                               link#8                        UH          lo0
2602::/24                         link#20                       U       wan_stf
2602:b8:431:dd00::                link#20                       UHS         lo0
2602:b8:431:dd00::/64             link#16                       U      igb1_vla
2602:b8:431:dd00::1               link#16                       UHS         lo0
fe80::%igb0/64                    link#1                        U          igb0
fe80::%igb1/64                    link#2                        U          igb1
fe80::2e0:67ff:fe27:ada1%igb1     link#2                        UHS         lo0
fe80::%lo0/64                     link#8                        U           lo0
fe80::1%lo0                       link#8                        UHS         lo0
fe80::%igb0_vlan201/64            link#11                       U      igb0_vla
fe80::2e0:67ff:fe27:ada0%igb0_vlan201 link#11                   UHS         lo0
fe80::%igb1_vlan3/64              link#12                       U      igb1_vla
fe80::2e0:67ff:fe27:ada1%igb1_vlan3 link#12                     UHS         lo0
fe80::%igb1_vlan4/64              link#13                       U      igb1_vla
fe80::2e0:67ff:fe27:ada1%igb1_vlan4 link#13                     UHS         lo0
fe80::%igb1_vlan5/64              link#14                       U      igb1_vla
fe80::2e0:67ff:fe27:ada1%igb1_vlan5 link#14                     UHS         lo0
fe80::%igb1_vlan6/64              link#15                       U      igb1_vla
fe80::2e0:67ff:fe27:ada1%igb1_vlan6 link#15                     UHS         lo0
fe80::%igb1_vlan1/64              link#16                       U      igb1_vla
fe80::2e0:67ff:fe27:ada1%igb1_vlan1 link#16                     UHS         lo0
fe80::2e0:67ff:fe27:ada0%ovpns1   link#17                       UHS         lo0
fe80::2e0:67ff:fe27:ada0%ovpnc3   link#18                       UHS         lo0
fe80::%pppoe0/64                  link#19                       U        pppoe0
fe80::2e0:67ff:fe27:ada0%pppoe0   2602::cdab:240                UGHS    wan_stf
fe80::2e0:67ff:fe27:ada1%pppoe0   2602::cdab:240                UGHS    wan_stf
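
Reading the table above, the default gateway looks like it sits inside the 2602::/24 route on wan_stf, while the Google monitor address would have to go through the default route. A quick sanity check of that reading, using only Python's standard `ipaddress` module and the addresses exactly as posted (the variable names are illustrative):

```python
import ipaddress

# Values copied from the netstat output and the monitor address above.
default_gw = ipaddress.ip_address("2602::cdab:240")
stf_route = ipaddress.ip_network("2602::/24")  # route on wan_stf (link#20)
monitor = ipaddress.ip_address("2001:4860:4860::8888")

gw_on_link = default_gw in stf_route
monitor_on_link = monitor in stf_route

print("gateway inside 2602::/24:", gw_on_link)
print("monitor inside 2602::/24:", monitor_on_link)
```

This only checks prefix membership; it says nothing about whether the far end actually answers ICMP.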


#5
21.7 Legacy Series / Re: 6rd Gateway monitoring
January 10, 2022, 08:21:40 PM
ISP: CenturyLink Fiber

# cat /tmp/wan_stf_routerv6
2602::205.171.2.64

# ifconfig wan_stf
wan_stf: flags=4041<UP,RUNNING,LINK2> metric 0 mtu 1280
        inet6 2602:b8:431:dd00:: prefixlen 24
        groups: stf
        v4net 184.4.49.221/0 -> tv4br 205.171.2.64
        nd6 options=101<PERFORMNUD,NO_DAD>
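
The ifconfig output above is consistent with how RFC 5969 builds a 6rd delegated prefix: the 6rd prefix bits followed by the WAN IPv4 address bits. A rough sketch of that arithmetic, assuming the 6rd prefix is 2602::/24 (from "prefixlen 24") and that the "/0" after the IPv4 address means no IPv4 bits are masked off; the function name is made up for illustration:

```python
import ipaddress


def sixrd_delegated_prefix(sixrd_prefix, wan_ipv4, v4_mask_len=0):
    """Sketch of the RFC 5969 delegated prefix calculation (hypothetical helper).

    The low (32 - v4_mask_len) bits of the WAN IPv4 address are placed
    directly after the 6rd prefix bits.
    """
    prefix = ipaddress.ip_network(sixrd_prefix)
    v4 = int(ipaddress.IPv4Address(wan_ipv4))
    v4_bits = 32 - v4_mask_len
    v4_part = v4 & ((1 << v4_bits) - 1)
    new_len = prefix.prefixlen + v4_bits
    if new_len > 128:
        raise ValueError("prefix plus IPv4 bits exceed 128 bits")
    base = int(prefix.network_address) | (v4_part << (128 - new_len))
    return ipaddress.IPv6Network((base, new_len))


# Assumed inputs: 6rd prefix 2602::/24 and the WAN address from ifconfig.
delegated = sixrd_delegated_prefix("2602::/24", "184.4.49.221")
print(delegated)
```

With these inputs the result starts with 2602:b8:431:dd00::, which matches the inet6 address shown on wan_stf.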


# ping6 2602:b8:431:dd00::
PING6(56=40+8+8 bytes) 2602:b8:431:dd00:: --> 2602:b8:431:dd00::
16 bytes from 2602:b8:431:dd00::, icmp_seq=0 hlim=64 time=0.086 ms
16 bytes from 2602:b8:431:dd00::, icmp_seq=1 hlim=64 time=0.044 ms
16 bytes from 2602:b8:431:dd00::, icmp_seq=2 hlim=64 time=0.038 ms


# ping6 2602::205.171.2.6
PING6(56=40+8+8 bytes) 2602:b8:431:dd00:: --> 2602::cdab:206

--- 2602::205.171.2.6 ping6 statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
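
One detail worth double-checking in the output above: the address in /tmp/wan_stf_routerv6 ends in .64, while the ping6 command targets .6. A minimal check with Python's standard `ipaddress` module that these are two different addresses (whether that is a typo in the post or at the prompt is not known):

```python
import ipaddress

router_file = ipaddress.IPv6Address("2602::205.171.2.64")  # from /tmp/wan_stf_routerv6
ping_target = ipaddress.IPv6Address("2602::205.171.2.6")   # from the ping6 command

print(router_file)
print(ping_target)
print("same address:", router_file == ping_target)
```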



Thanks Franco.
Let me know if you need anything else.
#6
21.7 Legacy Series / 6rd Gateway monitoring
January 10, 2022, 04:49:16 PM
It appears gateway monitoring doesn't work on a 6rd prefix setup over PPPoE.

IPv6 works from the router (via ping6) and services the LAN with no issue.

With the gateway left at its defaults, the gateway shows as online.  If I enable monitoring and enter a Google DNS server as the monitor address, the status changes to down.  IPv6 on the router and the LAN is unaffected (still operational).

The log shows the following entry:
/system_gateways.php: The WAN_6RD IPv6 gateway address is invalid, skipping.

The gateway (in its default state) configures the gateway address as:
2602::205.171.2.64

This is verifiable in wan_stf_routerv6 and wan_stf_defaultgwv6

'netstat -r | grep default' has the following output:

default            phn4-dsl-gw11.phn4 UGS      pppoe0
default            2602::cdab:240     UGS     wan_stf
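
The two spellings above, 2602::205.171.2.64 in the gateway files and 2602::cdab:240 in netstat, are the same address written in mixed and in hexadecimal notation. A short sketch confirming that with Python's standard `ipaddress` module (variable names are illustrative):

```python
import ipaddress

mixed = ipaddress.IPv6Address("2602::205.171.2.64")  # as stored in wan_stf_routerv6
hexed = ipaddress.IPv6Address("2602::cdab:240")      # as printed by netstat

# The trailing dotted quad fills the last 32 bits of the address.
low32 = int(mixed) & 0xFFFFFFFF
embedded_v4 = ipaddress.IPv4Address(low32)

print(mixed == hexed)
print(embedded_v4)
```

The embedded IPv4 part comes back out as 205.171.2.64.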




#7
Quote from: JasonJoel on October 16, 2021, 04:01:30 PM
Quote from: athurdent on October 16, 2021, 10:34:01 AM
- a few more policies for the home subscription, to make your average network security admin happy, who's coming home from working with Checkpoint and Cisco. This way we could cover the basics, with a policy each for guest, IoT, kids and parents. Plus one or two to experiment with.

This times 1000. ZenArmor identifying traffic is next to useless if you can't actually use that introspection to DO SOMETHING. And with only 3 policies available, you can't do much of anything if you have a main,  guest, and IoT VLAN - which many people do these days...

Throw in kids vs adult policy needs and you definitely can't do what you need in 3 policies... This is 100% a deal breaker/will not renew my subscription issue for me. So I guess after 11/28 you won't have to put up with my complaining any more.

Your multiple x 1000 again.  ;D
#8
I'd be careful with the Realtek NICs.  The drivers seem to do fine as long as you don't use anything netmap-based (IPS/IDS).

Enabling those types of services seems to make the NIC crash.  At this time, I'd say it's YMMV.
#9
I found the lock in the UI.  That's handy.  ;D
#10
Also decided to test this out with a Protectli FW6B.

- WAN is PPPoE 1G/1G
- LAN is split between LAN and 4 VLANs
- Sensei running on LAN.

I set net.inet.rss.bits = 2.  Not sure if this is correct on a 2 core/2HT processor.  Seems to work fine on speedtest with reasonable CPU usage.
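
For what it's worth, the bits value maps to a bucket count of 2 to the power of bits. A tiny sketch of that arithmetic follows. It rests on two assumptions the post does not confirm: that "2 core/2HT" means four logical CPUs, and that one bucket per logical CPU is the goal. Both helper names are invented for illustration.

```python
def rss_buckets(bits):
    """Bucket count for a given net.inet.rss.bits value (2 ** bits)."""
    return 1 << bits


def bits_for_cpus(logical_cpus):
    """Smallest bits value whose bucket count covers every logical CPU."""
    return max(logical_cpus - 1, 0).bit_length()


print(rss_buckets(2))    # buckets with net.inet.rss.bits = 2
print(bits_for_cpus(4))  # assumed 2 cores x 2 threads = 4 logical CPUs
```

Under those assumptions a value of 2 lines up with four logical CPUs.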

Also something to keep in mind: upgrading to a hotfix release will replace the RSS kernel.  Be sure to reinstall the RSS kernel if you upgrade to the hotfix released yesterday.
#11
21.7 Legacy Series / Re: os-realtek-re plugin
September 28, 2021, 05:24:19 PM
Let's slow down. You are mixing up multiple things:

I wouldn't be surprised.   :)

1. The vendor driver is in src.git master branch and most recent stable branches reaching back to 2017.
    - OK that is what I thought.

2. hw.re.max_rx_mbuf_sz exists ONLY in the newly added realtek-re-kmod port installed by the plugin of the same name.
    - So this is not the os-realtek-re plugin?  If I search for 'realtek-re-kmod' in plugins, it does not show up.  I took these to be the same because their dmesg output when the driver loads appears to be the same, i.e., it prints patent and driver version info where the FreeBSD driver does not.  This leaves me wondering why one would reference max_rx_mbuf_sz while the other is static?

3. The FreeBSD driver supports NATIVE netmap mode, the vendor driver (port or OPNsense src.git) uses the EMULATED driver. I haven't heard a lot of bad things about the emulated driver use so far. In fact, reports were a lot more positive towards EMULATED driver back in 2017 when we did the switch.
    - So this can be used (alongside the output from dmesg on boot when the driver loads) to confirm which driver is installed.  Emulated vs native didn't really matter to me.  I wanted to see if I could figure out why my WAN link dies when these services are enabled under any sort of load.

4. I'm unsure what you are trying to achieve. At least we need to establish a better baseline and also inspect the actual hardware chipset you have at hand.

I'm trying to figure out if the NIC is bad or if there is a driver issue.  For now, I am going to replace this unit with a Protectli as IPS is important to me.

As of now using anything IDS/IPS related on this device will render it unworkable after some amount of time.  The only hint from dmesg that something is wrong is "re0: reset never completed!".  The only way out of this broken state is a reboot.

I can hang on to the unit for a while if you'd like me to do some exploratory work.  Here is the output from pciconf:

# pciconf -lbcevV re0
re0@pci0:1:0:0: class=0x020000 card=0x012310ec chip=0x816810ec rev=0x15 hdr=0x00
    vendor     = 'Realtek Semiconductor Co., Ltd.'
    device     = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller'
    class      = network
    subclass   = ethernet
    bar   [10] = type I/O Port, range 32, base 0xe000, size 256, enabled
    bar   [18] = type Memory, range 64, base 0xa1304000, size 4096, enabled
    bar   [20] = type Memory, range 64, base 0xa1300000, size 16384, enabled
    cap 01[40] = powerspec 3  supports D0 D1 D2 D3  current D0
    cap 05[50] = MSI supports 1 message, 64 bit
    cap 10[70] = PCI-Express 2 endpoint MSI 1 max data 128(128) RO
                 link x1(x1) speed 2.5(2.5) ASPM disabled(L0s/L1)
    cap 11[b0] = MSI-X supports 4 messages, enabled
                 Table in map 0x20[0x0], PBA in map 0x20[0x800]
    ecap 0001[100] = AER 2 0 fatal 0 non-fatal 0 corrected
    ecap 0002[140] = VC 1 max VC0
    ecap 0003[160] = Serial 1 01000000684ce000
    ecap 0018[170] = LTR 1
    ecap 001e[178] = unknown 1
  PCI-e errors = Correctable Error Detected
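
As a reading aid for the chip= and card= fields above: pciconf packs the device ID into the upper 16 bits and the vendor ID into the lower 16 bits. A small sketch that splits both values (the helper name `split_pci_id` is made up for illustration):

```python
def split_pci_id(value):
    """Split a 32-bit pciconf ID into (device, vendor), 16 bits each."""
    return (value >> 16) & 0xFFFF, value & 0xFFFF


device_id, vendor_id = split_pci_id(0x816810EC)        # chip= field
subdevice_id, subvendor_id = split_pci_id(0x012310EC)  # card= field

print(f"vendor=0x{vendor_id:04x} device=0x{device_id:04x}")
print(f"subvendor=0x{subvendor_id:04x} subdevice=0x{subdevice_id:04x}")
```

The vendor ID 0x10ec corresponds to the 'Realtek Semiconductor Co., Ltd.' line in the same output.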


#12
21.7 Legacy Series / Re: os-realtek-re plugin
September 27, 2021, 08:48:20 PM
This was also a no-go.

I believe this may be a hardware fault/incompatibility.

#13
21.7 Legacy Series / Re: os-realtek-re plugin
September 27, 2021, 04:11:30 PM
I went ahead and compiled the driver from https://github.com/kostikbel/rere
against the OPNSense kernel source.

I replaced the binary if_re.ko in boot/modules with the one I compiled and rebooted.

I enabled sensei with the native netmap module (although from the dmesg output it doesn't appear native mode works/is supported).

I will report back and see if these commits take care of my issue.
#14
21.7 Legacy Series / Re: os-realtek-re plugin
September 26, 2021, 07:25:29 AM
So I've been doing some code comparison today.  You'll have to forgive me, C/C++ and driver development are not my areas of expertise...

I was reviewing the Realtek vendor driver in the OPNSense Github here:
https://raw.githubusercontent.com/opnsense/src/21.7.2/sys/dev/re/if_re.c

What I noticed is that there is no reference to it reading the hw.re.max_rx_mbuf_sz tunable, as specified in the following README:

Quote
Add the following lines to your /boot/loader.conf
to override the built-in FreeBSD re(4) driver.

if_re_load="YES"
if_re_name="/boot/modules/if_re.ko"

By default, the size of allocated mbufs is enough
to receive the largest Ethernet frame supported
by the card.  If your memory is highly fragmented,
trying to allocate contiguous pages (more than
4096 bytes) may result in driver hangs.
For this reason the value is tunable at boot time,
e.g. if you don't need Jumbo frames you can lower
the memory requirements and avoid this issue with:

hw.re.max_rx_mbuf_sz="2048"

Unless I am somehow missing it, I don't see how the vendor driver in OPNsense uses this tunable.

I'm using the following branch as a reference which has this sysctl enabled as a tunable:
https://github.com/kostikbel/rere

This person seems to have had similar issues on a NAS box.  I suspect that somewhere along the line his commits may have been pushed to the FreeBSD driver tree, but for some reason they are not present in OPNsense.

My running theory is the increased load from something like suricata or sensei is causing this memory fragmentation issue and eventually killing the driver.
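
To make the README's point concrete, here is a back-of-the-envelope sketch of the buffer sizes involved. The 4096-byte threshold and the 2048 value come from the README quoted above; the MTUs of 1500 and 9000 and the per-frame overhead are illustrative assumptions, not values taken from this box.

```python
PAGE_SIZE = 4096           # threshold quoted in the README
ETH_OVERHEAD = 14 + 4 + 4  # Ethernet header + one VLAN tag + FCS (assumed)


def frame_bytes(mtu):
    """Largest frame for a given MTU, including one VLAN tag."""
    return mtu + ETH_OVERHEAD


def needs_contiguous_pages(buffer_size):
    """True if a receive buffer of this size is larger than one page."""
    return buffer_size > PAGE_SIZE


standard = frame_bytes(1500)  # assumed standard MTU
jumbo = frame_bytes(9000)     # assumed jumbo MTU

print(standard, standard <= 2048)
print(jumbo, needs_contiguous_pages(jumbo))
print(needs_contiguous_pages(2048))
```

Under these assumptions a standard frame fits comfortably in a 2048-byte buffer, so lowering the tunable would only matter if the driver actually reads it.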

I may try to build this myself and try it out... That would require me to set up a dev VM and all that.
If someone would/could beat me to it, that would be great!



#15
21.7 Legacy Series / Re: os-realtek-re plugin
September 25, 2021, 11:15:50 PM
So far I've had no luck finding anything in the logs that indicates what is going on.  It happened twice in the span of 45 minutes.

The common symptoms:
1. re0 goes down every time (WAN).  The TX/RX light stops blinking when this occurs.
2. re1 (LAN) remains responsive and functional.
3. CPU spikes to 100%.  Unbound, python, suricata (or sensei related stuff when testing it) are the culprits.  Stopping/killing those processes does not change the state of the system.
4. State table goes through the roof, as does memory and eventually swap.
5. A reboot is the only way to get the router back in a fully usable state.

The only relevant thing I see in dmesg is:

re0: reset never completed!

This seems to happen only when Sensei or Suricata is enabled.  I've only run Suricata in IDS mode, and while testing Sensei I tried native netmap, emulated, and passive modes.
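
Since that dmesg line is the only reliable marker so far, a small scanner could at least record when and on which interface it shows up. A minimal sketch, assuming the message wording quoted above; the helper name and the sample text are made up for illustration, and feeding it real dmesg output is left to the reader:

```python
import re

# Assumes the dmesg wording quoted above; matches re0, re1, and so on.
RESET_RE = re.compile(r"^(re\d+): reset never completed!", re.MULTILINE)


def stuck_interfaces(dmesg_text):
    """Return the sorted interfaces that logged a failed reset (hypothetical helper)."""
    return sorted(set(RESET_RE.findall(dmesg_text)))


# Made-up sample: the quoted message twice plus an unrelated line.
sample = (
    "some unrelated line\n"
    "re0: reset never completed!\n"
    "re0: reset never completed!\n"
)
print(stuck_interfaces(sample))
```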



I'm not sure what else I can do to gather more information?