Support Forum
The Forums are a place to find answers on a range of Fortinet products from peers and product experts.
seadave
Contributor III

sslvpnd crash

Hello-

 

Running 5.4.4 on 500D logging to FAZ VM running 5.6.  Using SSLVPN for remote access with FAC MFA.  On Monday I upgraded my FAZ from 5.4.X to 5.6.  Really like 5.6, but it appears that the FAZ is now opening and closing SSL connections to upload logs every 10 seconds or so.  Way more often then before the upgrade.

 

Yesterday at 11:41PST, my FAZ logged the sslvnpd process crashing on the FG500D, followed by CPU util of 99% and then scanunit process also tanking on the FG.  GUI and Console were non-responsive so I performed a hard reboot.  Upon reboot it was ok for a few minutes but again went to lack of response on console and GUI until I pulled all NICs.  Once all interfaces were down, console started responding.

 

We have a MSSP's firewall upstream of our FG500 (it sits in transparent mode inline) and I traced the issue to that; at least I thought that was the case.  System has been stable since I removed that device.

 

Just got off the phone with the MSSP and they are monitoring their device over a backup link and indicate no failed NICs, or connection errors.

 

So I'm wondering if I experienced some sort of memory link caused by the realtime SSL connections between the FAZ and FG500D?  My FAZ is connected to the FG via a dedicated VLAN/Ports so wondering if I can disable SSL as it isn't really needed in our environment.

 

Also wondering if anyone else has seen this issue.  I attached a log with some of the applicable events caught by the FAZ.

 

Thanks!

1 REPLY 1
seadave
Contributor III

Update to this.  Yesterday I configured a LDAP server on our FG500D (at the time 5.4.4, now 5.4.5) to communicate with FSSO so I could audit user activity and apply some AD groups to policy.  I enabled STARTTLS and all appeared fine, but the firewall crashed shortly after I configured this.  Console was still live so I rebooted, but I was not able to login to the GUI until I removed all interface cables.  After that GUI came up.  I configured LDAP for 389 and it has been stable since.  This is the second time that it appears some type of SSL race condition had taken down the system.  As mentioned above, the last time was after a FAZ upgrade that enabled the SSL encryption between FG and FAZ.  After disabling the "Encrypt Logs" option on the FG, it was fine.  Running "dia deb crashlog read" showed that a segmentation fault had occured earlier in the day:

 

14521: 2017-08-28 10:23:03 <09678> scanunit=manager str="Success loading anti-virus database." 14522: 2017-08-28 10:43:03 scanunit=manager pid=9678 str="AV database changed (1); restarting workers" 14523: 2017-08-28 10:43:03 <09678> scanunit=manager str="Success loading anti-virus database." 14524: 2017-08-28 10:50:40 the killed daemon is /bin/pyfcgid: status=0x0 14525: 2017-08-28 10:53:03 scanunit=manager pid=9678 str="AV database changed (1); restarting workers" 14526: 2017-08-28 10:53:03 <09678> scanunit=manager str="Success loading anti-virus database." 14527: 2017-08-28 11:02:57 <09709> firmware FortiGate-500D v5.4.4,build1117b1117,170209 (GA) (Release) 14528: 2017-08-28 11:02:57 <09709> application wad 14529: 2017-08-28 11:02:57 <09709> *** signal 11 (Segmentation fault) received *** 14530: 2017-08-28 11:02:57 <09709> Register dump: 14531: 2017-08-28 11:02:57 <09709> RAX: 00007f236f91cfff RBX: 00007f237cb24290 14532: 2017-08-28 11:02:57 <09709> RCX: 0000000000008000 RDX: ffffffffffffffe6 14533: 2017-08-28 11:02:57 <09709> R8: 00007f236f91cfff R9: 00007f238b687b60 14534: 2017-08-28 11:02:57 <09709> R10: 000000000000000d R11: 00007f238b6c1b00 14535: 2017-08-28 11:02:57 <09709> R12: 0000000000000001 R13: 00007f237a4675b0 14536: 2017-08-28 11:02:57 <09709> R14: 00000000000000e7 R15: 00000000000000e7 14537: 2017-08-28 11:02:57 <09709> RSI: 00007f2379e82081 RDI: 00007f236f91d080 14538: 2017-08-28 11:02:57 <09709> RBP: 00007fff05b5a1c0 RSP: 00007fff05b5a198 14539: 2017-08-28 11:02:57 <09709> RIP: 00007f238b687bf4 EFLAGS: 0000000000010283 14540: 2017-08-28 11:02:57 <09709> CS: 0033 FS: 0000 GS: 0000 14541: 2017-08-28 11:02:57 <09709> Trap: 000000000000000e Error: 0000000000000006 14542: 2017-08-28 11:02:57 <09709> OldMask: 0000000000000000 14543: 2017-08-28 11:02:57 <09709> CR2: 00007f236f91cfff 14544: 2017-08-28 11:02:57 <09709> Backtrace: 14545: 2017-08-28 11:02:57 <09709> [0x7f238b687bf4] => /fortidev4-x86_64/lib/libc.so.6 liboffset 00130bf4 14546: 2017-08-28 11:02:57 <09709> [0x01458716] => /bin/wad 14547: 2017-08-28 11:02:57 <09709> [0x013c4d07] => /bin/wad 14548: 2017-08-28 11:02:57 <09709> [0x013c4e41] => /bin/wad 14549: 2017-08-28 11:02:57 <09709> [0x013c5a48] => /bin/wad 14550: 2017-08-28 11:02:57 <09709> [0x01411fab] => /bin/wad 14551: 2017-08-28 11:02:57 <09709> [0x013ea3a0] => /bin/wad 14552: 2017-08-28 11:02:57 <09709> [0x013e18df] => /bin/wad 14553: 2017-08-28 11:02:57 <09709> [0x013d6dd1] => /bin/wad 14554: 2017-08-28 11:02:57 <09709> [0x013d6e11] => /bin/wad 14555: 2017-08-28 11:02:57 <09709> [0x013d792f] => /bin/wad 14556: 2017-08-28 11:02:57 <09709> [0x013da9c9] => /bin/wad 14557: 2017-08-28 11:02:57 <09709> [0x01487098] => /bin/wad 14558: 2017-08-28 11:02:57 <09709> [0x014599d2] => /bin/wad 14559: 2017-08-28 11:02:57 <09709> [0x0149869e] => /bin/wad 14560: 2017-08-28 11:02:57 <09709> [0x00427a8c] => /bin/wad 14561: 2017-08-28 11:02:57 <09709> [0x0042e282] => /bin/wad 14562: 2017-08-28 11:02:57 <09709> [0x7f238b578475] => /fortidev4-x86_64/lib/libc.so.6 14563: 2017-08-28 11:02:57 (__libc_start_main+0x000000f5) liboffset 00021475 14564: 2017-08-28 11:02:57 <09709> [0x004250a9] => /bin/wad 14565: 2017-08-28 11:02:57 <09709> proess=wad type=3 pid=9709 total=7996 free=5157 mmu=165461552 14566: 2017-08-28 11:02:57 mu=91448083 m=1440129479 f=1440026121 r=0 14567: 2017-08-28 11:02:57 <09709> current tcp(0x7f236ba03650) vf=0 session-id=40023178 app_type=1 14568: 2017-08-28 11:02:57 dyn_type=0 state=2 cur_bank=0x11a097a0 cur_tl=0x11ead2e0 cur_tm=0x7f237b2dd400 14569: 2017-08-28 11:02:57 14570: 2017-08-28 11:15:03 scanunit=manager pid=9678 str="AV database changed (1); restarting workers" 14571: 2017-08-28 11:15:03 <09678> scanunit=manager str="Success loading anti-virus database."

 

But the main crash occured at 18:30 (little is shown in the crashlog to support any root cause):

 

14637: 2017-08-28 18:01:07 scanunit=manager pid=9678 str="AV database changed (1); restarting workers" 14638: 2017-08-28 18:01:07 <09678> scanunit=manager str="Success loading anti-virus database." 14639: 2017-08-28 18:11:06 scanunit=manager pid=9678 str="AV database changed (1); restarting workers" 14640: 2017-08-28 18:11:06 <09678> scanunit=manager str="Success loading anti-virus database." 14641: 2017-08-28 18:15:33 the killed daemon is /bin/pyfcgid: status=0x0 14642: 2017-08-28 18:18:14 scanunit=manager pid=9678 str="AV database changed (1); restarting workers" 14643: 2017-08-28 18:18:15 <09678> scanunit=manager str="Success loading anti-virus database." 14644: 2017-08-28 18:33:06 scanunit=manager pid=9678 str="AV database changed (1); restarting workers" 14645: 2017-08-28 18:33:06 <09678> scanunit=manager str="Success loading anti-virus database."

 

Under "config sys global" I do have "set strong-crypto enable" configured.  My dh-params are set to 2048.

 

TIA for any ideas.

 

 

Labels
Top Kudoed Authors