goultardgrovy
New Contributor

High current session, leads to high CPU and causes internet downtime

Good day!

 

I'm in charge of networking for a company with roughly 150 employees in the office.

 

 

Recently, however, we ran into an issue: CPU usage gets very high, which causes outgoing/incoming internet traffic to crawl, and all employees are unable to connect to the internet. When it occurs, Current Session (GUI) reports up to 5,000 sessions! To investigate further, I spent weeks observing the traffic in Current Session (GUI). In normal usage, the FortiGate 60D handles 1,000 to 2,700 sessions without hiccups.

 

We also used the CLI to observe the stats, and it reports that ipsmonitor and scanunitd use the most resources during high load.

 

Another note: during high CPU load, internet traffic is essentially dead, but the LAN still works. Employees are able to access the NAS through the network (DHCP is handled by the same FortiGate 60D). This means the FortiGate is still working for the intranet but is somehow 'stuck' for internet traffic.

 

According to this spec provided by Fortinet, the 60D is supposed to handle up to 500,000 concurrent sessions (TCP)! So why can't it handle even 5,000 current sessions?

 

Did we set up the policy wrongly? How can I handle the traffic better to get high availability? Or should we upgrade to a bigger FortiGate firewall, like the 100 series?

 

P.S.: I can't upload more than 1 file. Where can I attach more files to better convey the message?

ede_pfau
Esteemed Contributor III

hi,

 

and welcome to the forums.

 

I'd say either the IPS or UTM settings in general are set too tight, or the hardware is undersized.

 

There are numerous posts here and articles in the KB that deal with 'preserving memory' with regard to UTM features. If you just enable ALL IPS signatures without selecting, then this box (or any other!) can easily be overwhelmed. Select carefully, filter by OS, and split traffic into several policies so that you can apply different UTM profiles.

 

Regarding hardware limits, it's not the number of sessions per se. These are limited by the available memory which depends on the model. 5000 sessions should be no problem even for a 60D.

From the stats (diag sys perf stat) you can see the session build rate ('new sessions per second'). If it reaches, say, more than 100, this model is undersized.

I mean, the 60D is a desktop model for the home office/small office, and you run ~300 devices in 4 locations over it. Can work, can fail. Take a look at a 100E or even a 60E; they have much more muscle.

 

 


Ede

"Kernel panic: Aiee, killing interrupt handler!"
goultardgrovy

Hi Ede,

 

Thank you so much for the input! 

 

Our FortiGate 60D is running firmware v5.6.1 build1484 (GA).

Our Internet Service Provider provides 30 Mbps over fiber.

 

However, I do have a few questions regarding your answers:

 

ede_pfau wrote:

There are numerous posts here and articles in the KB that deal with 'preserving memory' with regard to UTM features. If you just enable ALL IPS signatures without selecting, then this box (or any other!) can easily be overwhelmed. Select carefully, filter by OS, and split traffic into several policies so that you can apply different UTM profiles.

 

 

Oh, maybe this is what we are looking for. As of now, we just enable UTM (is that Security Profiles?) and have only 1 policy group, which handles ~350+ devices (desktops on the LAN and mobile devices through the AP).

 

The UTM profile has many modules within it, e.g.:

AntiVirus

Web Filter

DNS Filter

Application Control

Intrusion Prevention (Is this IPS?)

Anti-Spam

Data Leak Prevention (we haven't enabled this; we don't foresee any threat in this regard)

FortiClient Profiles

 

If, as you say, UTM is overloaded, this could mean that too many Intrusion Prevention signatures are being applied. Is it possible to use the minimum set that still gives maximum protection? Where can I read up more on this? In the Knowledge Base?

 

Based on my limited knowledge of QoS, is it possible to limit the incoming bandwidth per user? (Our Internet Service Provider does not support this capability.)

 

ede_pfau wrote:

 

Regarding hardware limits, it's not the number of sessions per se. These are limited by the available memory which depends on the model. 5000 sessions should be no problem even for a 60D.

From the stats (diag sys perf stat) you can see the session build rate ('new sessions per second'). If it reaches, say, more than 100, this model is undersized.

 

 

diag sys perf stat [didn't work]

get sys perf stat [worked]

 

CPU states: 25% user 24% system 0% nice 51% idle
CPU0 states: 25% user 24% system 0% nice 51% idle
Memory: 1883936k total, 860464k used (45%), 1023472k free (55%), 113352k buffers
Average network usage: 11404 / 8394 kbps in 1 minute, 13466 / 8963 kbps in 10 minutes, 12826 / 9316 kbps in 30 minutes
Average sessions: 1158 sessions in 1 minute, 1223 sessions in 10 minutes, 1277 sessions in 30 minutes
Average session setup rate: 9 sessions per second in last 1 minute, 9 sessions per second in last 10 minutes, 10 sessions per second in last 30 minutes
Average NPU sessions: 0 sessions in last 1 minute, 0 sessions in last 10 minutes, 0 sessions in last 30 minutes
Virus caught: 0 total in 1 minute
IPS attacks blocked: 0 total in 1 minute
Uptime: 17 days, 3 hours, 36 minutes

 

 

I should be looking at Average session setup rate, correct? This stat was taken off-peak (when nobody was working at the workstations); I think I should run it at peak load to get more insight.

 

Anything else I should pay attention to?
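
A minimal sketch, in Python (not something posted in the thread), of how the session setup rate could be pulled out of pasted get sys perf stat output and compared against the roughly 100 sessions per second rule of thumb mentioned in the reply above. The function names and the threshold constant are illustrative assumptions, and the threshold is that reply's estimate rather than an official Fortinet limit. The sample text is copied from the output above.

```python
import re

# Sample line copied from the `get sys perf stat` output posted in this thread.
SAMPLE = (
    "Average session setup rate: 9 sessions per second in last 1 minute, "
    "9 sessions per second in last 10 minutes, "
    "10 sessions per second in last 30 minutes"
)

# Rule of thumb quoted in the thread for a 60D; an assumption, not a vendor limit.
SETUP_RATE_LIMIT = 100


def setup_rates(perf_stat_text):
    """Return every 'N sessions per second' figure found in the text, in order."""
    return [int(n) for n in re.findall(r"(\d+) sessions per second", perf_stat_text)]


def looks_undersized(perf_stat_text, limit=SETUP_RATE_LIMIT):
    """True if any reported setup rate exceeds the rule-of-thumb limit."""
    rates = setup_rates(perf_stat_text)
    return bool(rates) and max(rates) > limit


print(setup_rates(SAMPLE))       # [9, 9, 10]
print(looks_undersized(SAMPLE))  # False
```

The off-peak sample is far below the threshold, so the check is only informative when fed output captured during the morning peak.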

 

Really appreciate your insight!

 

P.S.: this is the session view captured just before it exploded and outgoing/incoming internet traffic slowed to a crawl (until Chrome reported a DNS error on each and every employee's monitor).

 

ede_pfau
Esteemed Contributor III

hi again,

 

lots of questions but that's OK. Same for me some years ago...

First off, get off of v5.6.1. On a 60D, this is just overkill. I tested it myself and had to downgrade to v5.4.5 just to make it work again. v5.4.5 is really stable and offers a lot of features as well. Do yourself a favor and downgrade. NOTE: this might delete parts / all of the config so get a backup first, and be prepared for some downtime. Good job on a Sunday!

 

Second, yes IPS = Intrusion Prevention System, a.k.a. "attack". This is likely your biggest problem after downgrading. I would only enable IPS for servers (cross site scripting, injection etc.) and  not for clients. You may add more in the future. For this to happen, split your traffic among several policies, with different source addresses for servers and clients. AV is mandatory for all hosts, no question. In the AV profile you should always enable "block botnet connections".

I'd leave out Antispam and DLP for the moment, it's not quite effective anyway (YMMV).

 

No, there is no traffic shaping for incoming traffic, only for outbound. By principle.

 

Sorry for the wrong command, you're right. I see that you already are familiar with the CLI :)

 

Nice picture at the end; a picture tells more than a thousand words... What would you expect with 99% CPU load? Looks like this is the regular "morning mail download" spike. Split the policies, reduce UTM heavily and watch this again next morning. After downgrading :-).

 

200 sessions per host is not unusual for web browsing (though it would be for email alone). So, the total number of sessions and the sessions per host are OK.
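
As a rough check of that claim, the arithmetic can be sketched in Python using only figures from this thread: the 5,000-session peak seen in the GUI, the roughly 350 devices, and the 200 sessions per host called plausible above. The function name is illustrative, and the result is an average, so individual hosts can still be well above it.

```python
def average_sessions_per_host(total_sessions, hosts):
    """Average number of concurrent sessions per host."""
    return total_sessions / hosts


PEAK_SESSIONS = 5000   # peak reported in the GUI during the outage
DEVICES = 350          # approximate device count given in the thread
PER_HOST_OK = 200      # per-host figure described as plausible for web browsing

avg = average_sessions_per_host(PEAK_SESSIONS, DEVICES)
print(round(avg, 1))      # 14.3
print(avg < PER_HOST_OK)  # True
```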

 

And again, if the suggested remedies do not bring CPU spikes down to ~50% think about upgrading the hardware. 350+ devices on the smallest Fortigate is asking for trouble (no, in my opinion there is NO real Fortigate below a model 60).

 

 


Ede

"Kernel panic: Aiee, killing interrupt handler!"
goultardgrovy


Thank you for your advice, sir!

 

We have fine-tuned it. It was the IPS applied to each and every workstation (one of the major causes of the internet halt): every morning, when employees start up their computers, the PCs check for Windows updates (we don't have a centralized Windows update server or a caching server like pfSense). We now apply IPS only on the servers. (Ironically, the supplier who sold us the FortiGate and helped us install it DID NOT apply IPS on the servers! But at that time the IT dept was a different team.)

 

So far, after 28 days of monitoring, even with 4,000+ sessions and CPU at 99%, the internet no longer halts as before; it is just a little slower. We are currently proposing to upgrade the 60D to a 60E (or maybe more). We didn't downgrade the firmware in the end.
