It's probably a scrubbing rule, or something else that is configured suboptimally compared to pfSense.
You can play with Normalization: http://x.x.x.x/firewall_scrub.php
Add rule - Interface: Wireguard Group, max MSS: 1300
Or, in the instance settings, tick "advanced" and set the MTU to the same value on all devices.
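To see how an MSS clamp and a tunnel MTU relate, here is a rough sketch of the arithmetic in Python. It assumes plain headers (no IP options or extension headers, no TCP options) and the standard WireGuard data-packet overhead of 32 bytes on top of UDP; the function names are made up for illustration, and the 1500-byte link MTU is only an example, so plug in your own values.

```python
# Header sizes in bytes. Assumption: no IP options/extension headers, no TCP options.
IP_HEADER = {4: 20, 6: 40}
UDP_HEADER = 8
WG_HEADER = 32   # WireGuard data message: 16-byte header + 16-byte auth tag
TCP_HEADER = 20


def wg_mtu(link_mtu, outer_ip=4):
    """Largest tunnel MTU that fits into one outer packet on the given link."""
    return link_mtu - IP_HEADER[outer_ip] - UDP_HEADER - WG_HEADER


def tcp_mss(tunnel_mtu, inner_ip=4):
    """Largest TCP payload per packet for a given tunnel MTU."""
    return tunnel_mtu - IP_HEADER[inner_ip] - TCP_HEADER


def mtu_for_mss(mss, inner_ip=4):
    """Tunnel MTU that corresponds to a given MSS clamp."""
    return mss + IP_HEADER[inner_ip] + TCP_HEADER


if __name__ == "__main__":
    print(wg_mtu(1500, outer_ip=4))   # 1440
    print(wg_mtu(1500, outer_ip=6))   # 1420 (the usual WireGuard default)
    print(tcp_mss(1420))              # 1380
    print(mtu_for_mss(1300))          # 1340
```

So a max MSS of 1300 for IPv4 traffic inside the tunnel corresponds to a tunnel MTU of about 1340, which is well below the usual 1420 and leaves headroom for links with a smaller MTU such as PPPoE.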
- use more threads (-P4 and -P8 give the same result; only -P1 is a little slower).
My CPU should be more or less comparable to the C3758R, and even somewhat faster in single-threaded applications. But when I look at "top" with threads and system processes enabled, I can see the kernel at ~300% (the rest is interrupts and user processes), so all 4 threads seem to be utilized.