Support Forum
kitkat0981
New Contributor

VPN Tunnel fail when removing SLAVE from HA

Hi All,

 

I have experienced an issue that I was not prepared to encounter. Here is the scenario:

 

1000D in HA running 5.2.4

Multiple IPSEC tunnels with OSPF running inside

 

Upgrade reasoning:

To upgrade to 6.0.5, I wanted to take the slave out of the cluster, upgrade it to 6.0.5 along the upgrade path recommended by Fortinet, and put it back in service at a later maintenance window. This would cut the number of outages from 6 (one per step of the upgrade path) to only 1 (the swap from the old master to the new 6.0.5 master). Upgrading an HA set that has VPN tunnels with OSPF riding over them causes an outage every time the HA fails over, once for each upgrade step, because IPSEC sessions are not carried over on an HA failover.

 

Outcome:

Instead of running exec shutdown on the slave, I disconnected the 2 HA ports and, about half a second later, the 2 10Gb ports that all the traffic passes through. This took all the tunnels down, and ultimately the OSPF routing with them. For some reason, the tunnels would not come back up. With management breathing down my neck, I reinstated the slave and reconnected the 4 cables (2x HA and 2x 10Gb). After a few minutes, seeing that the tunnels were still not coming back up, I forced a reset on one of the tunnels and they all came back online.

 

Now, of course, I need to do a post-mortem and root cause analysis, but I have no data to go on apart from the fact that the firewalls are running an old, unsupported release.

 

I know that an MTU mismatch issue on OSPF interfaces was introduced in 5.2, but I am not sure whether this is the cause.

 

Does anyone have other ideas, or has anyone encountered this issue before?

regards,

 

NP

1 REPLY
Toshi_Esumi
Esteemed Contributor III

I'm not sure why the tunnels didn't come back up when you restored the connections. But the mistake you made was disconnecting HA before isolating the slave. When you disconnect HA, both units act as master because neither sees the other end, which causes a conflict on the 10Gb connections. So always disconnect the user traffic (monitored?) connections on the slave first to isolate it, check the HA status with "get sys ha status" on both units through the console, and only then disconnect HA to avoid syncing while you work on the upgrade on one side.

"get sys ha status" shows the reasons for any HA status changes and any issues with monitored and HA interfaces. Keep checking it from time to time.
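
The checks described above could be run roughly as follows. This is only a sketch: "get system ha status" is the long form of the command quoted in the reply, while the VPN and OSPF commands are additions from the standard FortiOS CLI as I recall it. They have not been verified against 5.2.4, so confirm them on your own units before relying on them. The lines starting with # are explanatory notes and are not meant to be typed.

```
# Run on BOTH units from the console, before and after each cable change.
get system ha status

# On the unit that stays in service, confirm that the IPsec tunnels
# and the OSPF adjacencies are still up after the slave is isolated.
diagnose vpn ike gateway list
diagnose vpn tunnel list
get router info ospf neighbor
```

Running the same set of commands before touching any cable gives a baseline to compare against, which is also the kind of data a later root cause analysis would need.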

 

 
