General Discussion / bgp_process_packet: BGP OPEN receipt failed for peer: n.n.n.n
« on: August 02, 2023, 12:53:49 am »
I have an awesome home network setup that now revolves around an OPNSense router. So, massive thanks and kudos to the devs and the whole community.
I have been trying to configure BGP to gather routes from my home K8S cluster and cloud-based K8S clusters and redistribute them to each other. I had it basically working, but then for some reason it started spitting out these errors, one or two per second, which I'm trying to investigate...
```
bgpd[79135] [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 10.234.234.7
```
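For anyone wanting to count or filter these lines, here is a minimal sketch that splits the log line above into its fields and shows the error code in hex. The regular expression is my own guess based on this single sample line, not a format documented by FRR, and `parse_line` is just a hypothetical helper name.

```python
import re

# Pattern inferred from the one sample line above (an assumption, not an FRR spec).
LOG_RE = re.compile(
    r"(?P<daemon>\w+)\[(?P<pid>\d+)\]\s+"   # e.g. bgpd[79135]
    r"\[EC (?P<ec>\d+)\]\s+"                # e.g. [EC 33554451]
    r"(?P<func>\w+):\s+(?P<msg>.*)"         # e.g. bgp_process_packet: BGP OPEN ...
)

def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["pid"] = int(fields["pid"])
    fields["ec"] = int(fields["ec"])
    fields["ec_hex"] = hex(fields["ec"])  # hex form is easier to compare with constants in the source
    return fields

line = ("bgpd[79135] [EC 33554451] bgp_process_packet: "
        "BGP OPEN receipt failed for peer: 10.234.234.7")
parsed = parse_line(line)
print(parsed["ec_hex"], parsed["func"], parsed["msg"])
```

The hex value only makes the code easier to search for in the source; this sketch does not attempt to map it to a named error constant.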
Looking into the FRR source, I see this is generated on this line in the `bgp_process_packet` function.
https://github.com/FRRouting/frr/blob/5da58d355a094100ddedb861aa5555be8a4ea1bf/bgpd/bgp_packet.c#L2926
Basically, it's triggered if the `bgp_open_receive` function returns `BGP_Stop`. However, there are a number of reasons this could happen, and the problem I am facing is that I am not seeing the reason logged anywhere, which makes it difficult to determine which step is failing or what might have broken since it was working.
Within `bgp_open_receive`, it attempts to do various things and make various checks. If any of these steps fails, it logs the message with `flog_err`, sends a NOTIFY and returns `BGP_Stop`. In some cases, though, it logs the error with a `zlog` call instead. I'm not sure why that inconsistency exists in the upstream code, but I expect there is a reason.
https://github.com/FRRouting/frr/blob/5da58d355a094100ddedb861aa5555be8a4ea1bf/bgpd/bgp_packet.c#L1365
Given that I see the 'receipt failed for peer' message, which is logged via `flog_err` with EC_BGP_PKT_OPEN, I would also expect to see the error from any step that logs its failure via `flog_err`. So I suspect the cause of my problem is one of the conditions that logs its error via `zlog` instead. But which one?!
My question at the moment is: where do the `zlog` messages get sent?
I have set log level to 'Debugging' in the Routing/General section.
Cheers,
--
Ross

