Discussion:
[strongSwan] Problem with active-active cluster and traffic handling
Jean-Daniel Dupas
2018-07-12 13:43:53 UTC
Hello,

I'm trying to set up an active-active HA cluster. I'm close to a fully working setup, but I have one blocking issue.

I have installed a custom kernel (4.15.x family) and set up CLUSTERIP as described in the HA guide ( https://wiki.strongswan.org/projects/strongswan/wiki/HighAvailability ).

Both my nodes receive the traffic, and they correctly manage the cluster IP so that each handles only half of the packets.
When I establish a session, only one node handles it (as expected), and the other one sets up a passive IKE_SA.

My problem is that once the session is up, the passive node (for that session) sometimes takes over the IPsec traffic while the active node completely ignores it.
If I sniff the incoming traffic (tcpdump), the decrypted traffic is only seen on the node that set up the passive IKE_SA, and not on the node with the active IKE_SA.

To make it clear, say I have two servers: Alice and Moon. The IKE session is established on Moon, and a passive session is created on Alice, but then the decrypted traffic only shows up on Alice.
As Alice is the passive node and doesn't have the iptables entries and routes needed to handle that traffic, it rejects it (as expected).

Does anyone know what can cause this inconsistency?

My iptables rule looks like this:

-A INPUT -i enp1s1 -d 1.2.3.4 -j CLUSTERIP --new --hashmode sourceip --clustermac 01:00:5e:00:00:20 --total-nodes 2 --local-node 0

I don't think this is relevant, but I'm using strongSwan 5.6.2 (systemd, swanctl) on Ubuntu 18.04.

Thanks
Jean-Daniel Dupas
2018-07-12 14:36:21 UTC
I found a hint in the swanctl logs:

07[CFG] installed HA CHILD_SA net{3} 0.0.0.0/0 ::/0 === 10.192.3.3/32 (segment in: 2*, out: 1)

strongSwan explicitly chooses different segments for input and output. The segment on which the connection was established is segment 1.

As it assigns segment 2 to the input traffic, it obviously does not work. Is there a setting to force strongSwan to always use the segment that owns the connection for both input and output traffic?
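
To compare the segments across several CHILD_SAs, the numbers can be pulled out of such log lines with a short script. This is only a sketch: the pattern is derived from the single line quoted above, the function name parse_ha_child_sa is made up for illustration, and reading the trailing "*" as "segment active on this node" is an assumption.

```python
import re

# Pattern built from the one log line quoted in this post; other strongSwan
# versions may format the line differently.
HA_CHILD_RE = re.compile(
    r"installed HA CHILD_SA (?P<name>\S+)\{(?P<id>\d+)\} .*"
    r"\(segment in: (?P<seg_in>\d+)(?P<in_mark>\*?), "
    r"out: (?P<seg_out>\d+)(?P<out_mark>\*?)\)"
)


def parse_ha_child_sa(line):
    """Return the segment numbers from an HA CHILD_SA log line, or None."""
    m = HA_CHILD_RE.search(line)
    if m is None:
        return None
    return {
        "name": m.group("name"),
        "id": int(m.group("id")),
        "segment_in": int(m.group("seg_in")),
        # Assumption: "*" marks a segment that is active on the logging node.
        "in_marked": m.group("in_mark") == "*",
        "segment_out": int(m.group("seg_out")),
        "out_marked": m.group("out_mark") == "*",
    }


line = ("07[CFG] installed HA CHILD_SA net{3} 0.0.0.0/0 ::/0 === "
        "10.192.3.3/32 (segment in: 2*, out: 1)")
info = parse_ha_child_sa(line)
print(info)
if info and info["segment_in"] != info["segment_out"]:
    print("inbound and outbound SAs are assigned to different segments")
```

Run against the full log of both nodes, this shows at a glance which SAs have their inbound and outbound directions on different segments.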
Tobias Brunner
2018-07-20 08:30:47 UTC
Hi Jean-Daniel,
Post by Jean-Daniel Dupas
07[CFG] installed HA CHILD_SA net{3} 0.0.0.0/0 ::/0 === 10.192.3.3/32
(segment in: 2*, out: 1)
strongswan explicitly choose different segments for input and output.
The segment where the connection was established here is the segment 1.
As it defines segment 2 for input traffic, it obviously does not works.
Why shouldn't that work? The same thing happened in our regression
testing framework [1]. Since the hashes for ESP traffic include the
SA's SPI and destination address, the SAs might be handled by different
nodes in the active-active scenario (for IKE traffic only the client's
IP is hashed); refer to [2] for some background.
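
As a rough illustration of why the segments can differ, here is a toy model of the two hash inputs. It is only a sketch: CRC32 stands in for the real hash (it is not what the HA plugin or CLUSTERIP uses), and the client address and SPI values are invented for the example.

```python
import ipaddress
import zlib

SEGMENT_COUNT = 2  # two nodes, as in --total-nodes 2


def toy_hash(data: bytes) -> int:
    # Stand-in hash for illustration only, NOT the hash strongSwan uses.
    return zlib.crc32(data)


def ike_segment(client_ip: str) -> int:
    """IKE traffic: only the client's IP goes into the hash."""
    key = ipaddress.ip_address(client_ip).packed
    return toy_hash(key) % SEGMENT_COUNT + 1


def esp_segment(spi: int, dst_ip: str) -> int:
    """ESP traffic: the SA's SPI and destination address go into the hash."""
    key = spi.to_bytes(4, "big") + ipaddress.ip_address(dst_ip).packed
    return toy_hash(key) % SEGMENT_COUNT + 1


# Hypothetical values: a documentation-range client address, the cluster IP
# from the iptables rule, and two invented SPIs.
client_ip, cluster_ip = "198.51.100.7", "1.2.3.4"
spi_in, spi_out = 0xC1234567, 0xC89ABCDE

print("IKE segment:         ", ike_segment(client_ip))
print("ESP inbound segment: ", esp_segment(spi_in, cluster_ip))
print("ESP outbound segment:", esp_segment(spi_out, client_ip))
```

Because the three keys are different byte strings, nothing ties the three results to the same segment, which is the effect visible in the quoted log line.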

Regards,
Tobias

[1] https://www.strongswan.org/testing/testresults/ha/both-active/moon.daemon.log
[2] https://wiki.strongswan.org/projects/strongswan/wiki/HighAvailability