Support Forum
The Forums are a place to find answers on a range of Fortinet products from peers and product experts.
itismo
New Contributor

Lost with dataset to sort top applications sorted by session count for each destination IP

I am trying to generate a report to show the top applications by session count and the destination IPs.

 

Essentially I have :

Dataset 1 to show top applications by sessions.

select appid, app, appcat, (case when (utmaction in ('block', 'blocked') or action='deny') then 'Blocked' else 'Allowed' end) as custaction, sum(coalesce(sentbyte, 0)+coalesce(rcvdbyte, 0)) as bandwidth,count(*) as num_session from $log where $filter and logid_to_int(logid) not in (4, 7, 14) and nullifna(app) is not null and policyid != 0 group by appid,app, appcat,custaction order by bandwidth desc

 

Dataset 2 to show all the destination IPs by application and number of sessions.

select dstip, app, appid, hostname, count(*) as total_num from $log where $filter and logid_to_int(logid) not in (4, 7, 14) group by dstip, app, appid, hostname order by total_num desc

 

Dataset 2 does not sort the results by the *top* applications like in dataset 1 but rather by the session count, i have tried every SQL statement that can be thought of to do the sorting but i am not an expert when it comes to combining SQL queries.

 

What i want now is to combine both datasets to actually do a drill down of dataset 1.

 

Any help please?

1 Solution
AtiT
Valued Contributor

Hello, I have a few things that you should thinking about in my opinion. ...Other ideas are also welcome...

 

1. In dataset 1 and 2 I recommend to use this: logid_to_int(logid) not in (4, 7, 14, 20) Add the 20 logid into the list

2. The dataset 2 is sorted by the session count because you said that you want to sor it that way: order by total_num desc In this case the whole row is tested to match another row and the exact match will increase the count number.

 

I think that in your case you need to count the application name first to find out how many times the application is in the logs and after that sort the result... somehow. So you will have the most used applications at the top and other details will be attached to them.

That means that yo need to use something that is called Window Function.

 

You can try this (traffic-log):

select `appid` as app_id,        `app` as application,        `hostname` as host_name,        `dstip` as destination,        sum(total_num) as sessions,        sum(total_num) over (partition by `app`) as app_totalnum from ###(     select `dstip`,            `app`,            `appid`,            `hostname`,            count(*) as total_num     from $log     where $filter           and logid_to_int(`logid`) not in (4, 7, 14, 20)     group by `dstip`,              `app`,              `appid`,              `hostname`     order by total_num desc )### as t group by app_id,          application,          host_name,          destination,          total_num having sum(total_num)>0 order by app_totalnum desc,          sessions desc

Please try this dataset maybe it will be something close you need.

The row called app_totalnum is a temporary result or "helping" result to sort the applications. Do not include this row in the chart.

 

Please find the row: having sum(total_num)>0 In this case, when the number is 0 this row is not needed. It will be helpful in case that you will need to filter out some visits from the results. probably you will be not intrested in results that has only 1 visits, or 2 visits, maybe you want to filter out visits up to 10. In this case you can use to filter out them like: having sum(total_num)>10 I you do not need filter out anything than delete this line. There is no reason do a summary of something that is not used.

I hope it helped in some way or at least this gave some useful information to someone.

AtiT

View solution in original post

AtiT
7 REPLIES 7
AtiT
Valued Contributor

Hello, I have a few things that you should thinking about in my opinion. ...Other ideas are also welcome...

 

1. In dataset 1 and 2 I recommend to use this: logid_to_int(logid) not in (4, 7, 14, 20) Add the 20 logid into the list

2. The dataset 2 is sorted by the session count because you said that you want to sor it that way: order by total_num desc In this case the whole row is tested to match another row and the exact match will increase the count number.

 

I think that in your case you need to count the application name first to find out how many times the application is in the logs and after that sort the result... somehow. So you will have the most used applications at the top and other details will be attached to them.

That means that yo need to use something that is called Window Function.

 

You can try this (traffic-log):

select `appid` as app_id,        `app` as application,        `hostname` as host_name,        `dstip` as destination,        sum(total_num) as sessions,        sum(total_num) over (partition by `app`) as app_totalnum from ###(     select `dstip`,            `app`,            `appid`,            `hostname`,            count(*) as total_num     from $log     where $filter           and logid_to_int(`logid`) not in (4, 7, 14, 20)     group by `dstip`,              `app`,              `appid`,              `hostname`     order by total_num desc )### as t group by app_id,          application,          host_name,          destination,          total_num having sum(total_num)>0 order by app_totalnum desc,          sessions desc

Please try this dataset maybe it will be something close you need.

The row called app_totalnum is a temporary result or "helping" result to sort the applications. Do not include this row in the chart.

 

Please find the row: having sum(total_num)>0 In this case, when the number is 0 this row is not needed. It will be helpful in case that you will need to filter out some visits from the results. probably you will be not intrested in results that has only 1 visits, or 2 visits, maybe you want to filter out visits up to 10. In this case you can use to filter out them like: having sum(total_num)>10 I you do not need filter out anything than delete this line. There is no reason do a summary of something that is not used.

I hope it helped in some way or at least this gave some useful information to someone.

AtiT

AtiT
itismo
New Contributor

AtiT, I can't thank you enough! That really worked :)

 

I spent a lot of time trying to figure this out, how long did it take you to make this?

itismo
New Contributor

Hello again AtiT,

 

Actually I noticed a small issue with the table, some entries for same app, same destination IP and same hostname are not grouped together and appear in different rows.

 

I have tried to figure out anything that might be causing this but according to the results returned by queries, there should be no different value causing the row split.

I have attached a screenshot, really your help is appreciated.

AtiT
Valued Contributor

Hi,

Do you see the same results when you run the dataset only with the Test button?

AtiT

AtiT
itismo
New Contributor

Hi AtiT,

 

Actually the test button is not working for me on this FortiAnalyzer, does not return anything so I cannot use it.

 

Why are you asking about the "test button" results anyways? Would it be a difference between using it and the data returned in the chart?

 

AtiT
Valued Contributor

Hello,

yes it is different. Using the Dataset Test button the FAZ does not using cache. In reports caches are used.

 

In our case I can see the problem. We have the sime lines multiple times so we have to summarize the results into one line.

Try this dataset:

 

 

select app_id,        application,        host_name,        destination,        sum(sessions) as sessions,        app_totalnum from (

    select `appid` as app_id,            `app` as application,            `hostname` as host_name,            `dstip` as destination,             sum(total_num) as sessions,             sum(total_num) over (partition by `app`) as app_totalnum

    from ###(

        select `dstip`,                `app`,                `appid`,                `hostname`,                count(*) as total_num

        from $log         where $filter               and logid_to_int(`logid`) not in (4, 7, 14, 20)         group by `dstip`,                  `app`,                  `appid`,                  `hostname`         order by total_num desc     )### as t     group by app_id,              application,              host_name,              destination,              total_num     order by app_totalnum desc,              sessions desc ) as a group by app_id,          application,          host_name,          destination,          app_totalnum having sum(sessions)>0 order by app_totalnum desc,          sessions desc

 

 

AtiT

AtiT
itismo
New Contributor

 

That worked! You are definitely great and saved me 

 

How long did it take you to figure those queries? Do you often need to edit/create datasets and you are already an SQL expert?

 

I would really love to thank you personally, I will be in prague later this month :)

 

Labels
Top Kudoed Authors