Helpful ReplyHot!Not Equal Results between Dataset and Report Chart

Author
AtiT
Platinum Member
  • Total Posts : 469
  • Scores: 42
  • Reward points: 0
  • Joined: 2012/04/18 12:13:27
  • Location: Prague / Czech Republic
  • Status: offline
2016/12/08 15:22:37 (permalink)
0

Not Equal Results between Dataset and Report Chart

Hello,
I have a FAZ-VM running on 5.2.7. I have a problem with the different output when I test the dataset and when I generate the report.
 
I issued the following commands but the results are the same:
# diagnose sql remove hcache
All hcache tables will be erased!
Do you want to continue? (y/n)y
Done.
# diagnose sql remove query-cache
Done.
# diagnose sql remove tmp-table
Done.
 
What I get is attached.
Is it a bug or I missed something?

Attached Image(s)


AtiT
--------------------
NSE 8, CCNP R+S
#1
hzhao_FTNT
Expert Member
  • Total Posts : 347
  • Scores: 60
  • Reward points: 0
  • Joined: 2014/09/12 10:03:54
  • Status: offline
Re: Not Equal Results between Dataset and Report Chart 2016/12/22 10:54:23 (permalink)
0
Hi AtiT,
 
As you know, report is generated by merging many hcache tables, but dataset test console not. All datasets in report must follow certain rules. Could you share your dataset?
 
Thanks,
hz
 
#2
AtiT
Platinum Member
  • Total Posts : 469
  • Scores: 42
  • Reward points: 0
  • Joined: 2012/04/18 12:13:27
  • Location: Prague / Czech Republic
  • Status: offline
Re: Not Equal Results between Dataset and Report Chart 2017/01/03 01:19:17 (permalink)
0
Hello hzhao,
 
This is the dataset for the Event log type:
 
SELECT wtime, COUNT(*) AS clients
FROM (
SELECT t1.time AS wtime, t1.source AS wsrce, smac AS wsmac
FROM (
    (SELECT TO_CHAR(TO_TIMESTAMP(`itime`)::timestamp, 'YYYY-MM-DD') AS time,
    IPSTR(`srcip`) AS source,
    COUNT(*) AS totalnum
    FROM $log
    WHERE $filter
    AND `action`='disclaimer'
    AND `logdesc`='Disclaimer accepted'
    AND `status`='success'
    AND IPSTR(`srcip`)!='0.0.0.0'
    AND NULLIFNA(IPSTR(`srcip`)) IS NOT NULL
    GROUP BY time, source
    ORDER BY time ASC
    ) AS t1
    INNER JOIN
    (SELECT TO_CHAR(TO_TIMESTAMP(`itime`)::timestamp, 'YYYY-MM-DD') AS time,
    IPSTR(`srcip`) AS source,
    `stamac` AS smac,
    COUNT(*) AS totalnum
    FROM $log
    WHERE $filter
    AND `subtype`='wireless'
    AND `action`='client-ip-detected'
    AND IPSTR(`srcip`)!='0.0.0.0'
    AND NULLIFNA(IPSTR(`srcip`)) IS NOT NULL
    GROUP BY time, source, smac
    ORDER BY time ASC
    ) AS t2
    ON t1.source=t2.source
)
GROUP BY wtime, wsrce, wsmac
ORDER BY wtime ASC
) AS x
GROUP BY wtime
ORDER BY wtime ASC
 

AtiT
--------------------
NSE 8, CCNP R+S
#3
hzhao_FTNT
Expert Member
  • Total Posts : 347
  • Scores: 60
  • Reward points: 0
  • Joined: 2014/09/12 10:03:54
  • Status: offline
Re: Not Equal Results between Dataset and Report Chart 2017/01/03 15:33:29 (permalink)
5 (1)
Hi AtiT,
 
You have to put ### for inner queries:
###(...)### AS t1 INNER JOIN ###(...)### AS t2
 
regards,
hz
#4
AtiT
Platinum Member
  • Total Posts : 469
  • Scores: 42
  • Reward points: 0
  • Joined: 2012/04/18 12:13:27
  • Location: Prague / Czech Republic
  • Status: offline
Re: Not Equal Results between Dataset and Report Chart 2017/01/04 00:35:46 (permalink)
0
Thank you very much, the ### fixed this issue.
Can I read more about the hcache and where to use the ### somewhere?
 

AtiT
--------------------
NSE 8, CCNP R+S
#5
hzhao_FTNT
Expert Member
  • Total Posts : 347
  • Scores: 60
  • Reward points: 0
  • Joined: 2014/09/12 10:03:54
  • Status: offline
Re: Not Equal Results between Dataset and Report Chart 2017/01/04 10:20:25 (permalink)
0
When report start to run, system will check ### for all subqueries which need to be merged first. Since there are many log tables will be queried, based on your original dataset, both sub and main queries will be executed for each log table, then system merge the results. But for modified dataset, only subqueries inside ###()### will be merged first, then execute main query at last. 
 
regards,
hz
#6
AtiT
Platinum Member
  • Total Posts : 469
  • Scores: 42
  • Reward points: 0
  • Joined: 2012/04/18 12:13:27
  • Location: Prague / Czech Republic
  • Status: offline
Re: Not Equal Results between Dataset and Report Chart 2017/11/28 05:51:52 (permalink)
0
Hello,
I have a problem with the cached results again. I have a different results for two simple queries. Can anyone test it and say what is wrong or why the report has different output?
Shouldn't be the same?
 
===== test1 =====
SELECT `srcip` AS source, COUNT(*) AS totalnum
FROM $log WHERE $filter
GROUP BY source
ORDER BY totalnum DESC
 
===== test2 =====
SELECT source, totalnum FROM ###(
SELECT `srcip` AS source, COUNT(*) AS totalnum
FROM $log WHERE $filter
GROUP BY source
ORDER BY totalnum DESC
)### a
 
Tested on FAZ 5.4.3 and FAZ 5.6.0.

Attached Image(s)


AtiT
--------------------
NSE 8, CCNP R+S
#7
hzhao_FTNT
Expert Member
  • Total Posts : 347
  • Scores: 60
  • Reward points: 0
  • Joined: 2014/09/12 10:03:54
  • Status: offline
Re: Not Equal Results between Dataset and Report Chart 2017/11/28 10:33:56 (permalink)
5 (1)
Hi there,
 
 
Test1 looks good to me, but it  seems you missed an aggregation function in test2, so the totalnum will be only from one log table. 
SELECT source, sum(totalnum) as totalnum FROM ###(
SELECT `srcip` AS source, COUNT(*) AS totalnum
FROM $log WHERE $filter
GROUP BY source
ORDER BY totalnum DESC
)### a group by source order by totalnum desc
 
regards,
Huai zhao
#8
AtiT
Platinum Member
  • Total Posts : 469
  • Scores: 42
  • Reward points: 0
  • Joined: 2012/04/18 12:13:27
  • Location: Prague / Czech Republic
  • Status: offline
Re: Not Equal Results between Dataset and Report Chart 2017/11/29 01:13:42 (permalink)
0
Hello,
Yes, that's look good for me. Thank you!
 
I just do not understand why I should do the sum() on the totalnum column as I have the results from the previous SELECT 'table a' and I do a SELECT on this result. The table exists just another select to show all entries from that table.
In other words -> show me the same as previous select = show me the results of table 'a' = show me the same results as test1.
 
Probably it will be some internal information that is related to caching and not publishable, I just want to understand how things work and why I am doing something...

AtiT
--------------------
NSE 8, CCNP R+S
#9
hzhao_FTNT
Expert Member
  • Total Posts : 347
  • Scores: 60
  • Reward points: 0
  • Joined: 2014/09/12 10:03:54
  • Status: offline
Re: Not Equal Results between Dataset and Report Chart 2017/11/29 10:58:32 (permalink)
5 (1)
Hi AtiT,
 
Depend on log rate, we need to query many log tables during report running. If we directly query all log tables, it will be easy to see "out of memory" in DB server, that's the reason why we developed hcache system. Under current hcache mechanism, the inner query with "###()###" will be applied to each log table, one log table will generate one hcache table(this will be done automatically when you schedule a report or enable auto-cache on it). During report running, the outer query will only work on those hcache tables, not log tables. 
 
Dataset test console is for testing purpose, it will query log tables directly, please do not select long time period if your log rate is high. In your case, test2 should work fine when you only have one log table, but if you have many log tables, it will take totalnum from the first log table only.
 
Below are some common issues which may not be detected by dataset test console:

•"$filter" is not applied
•No "###" for inner query
•"distinct/limit" is used in inner query --
for example:
wrong: select sum(user_count) as user_count from ###(select count(distinct `user`) as user_count ......
right: select count(distinct user_src) as user_count from ###(select `user` as user_src from ... group by `user`...
•aggregation function only applied to inner query ---- like test2 in your case
•No column alias for column with function--
for example:
wrong: select count(*) from $log...
right: select count(*) as total_num from $log....
•No “order by” -- in some cases, these will affect report accuracy
•Try to join log tables (for example, join traffic log with IPS log)
 
I hope these info helped.
 
Regards,
hz
#10
AtiT
Platinum Member
  • Total Posts : 469
  • Scores: 42
  • Reward points: 0
  • Joined: 2012/04/18 12:13:27
  • Location: Prague / Czech Republic
  • Status: offline
Re: Not Equal Results between Dataset and Report Chart 2017/12/11 05:35:20 (permalink)
0
Hello hzhao_FTNT
Now it is clear for me.
Thank you very much! 

AtiT
--------------------
NSE 8, CCNP R+S
#11
shaneboy
New Member
  • Total Posts : 5
  • Scores: 0
  • Reward points: 0
  • Joined: 2019/10/18 09:36:31
  • Status: offline
Re: Not Equal Results between Dataset and Report Chart 2019/11/06 03:15:25 (permalink)
0
Hi, I found this thread and I wanna know if you can help me. I need to do a dataset for a report where show this fields:
app    |   sum(bandwidth) of today   |   average of sum(bandwidth) of the last N days
my query is this one:
select t1.app,(sum(coalesce(t1.sentbyte, 0)+coalesce(t1.rcvdbyte, 0)))/$days_num as bandwidth_historico,t2.bandwidth_hoy
from $log t1 inner join
###(select app,sum(coalesce(sentbyte, 0)+coalesce(rcvdbyte, 0)) as bandwidth_hoy
from $log where $filter and (logflag&1>0) and
nullifna(app) is not null and to_timestamp(itime)::date >= current_date
group by app having sum(coalesce(sentbyte, 0)+coalesce(rcvdbyte, 0))>0 order by bandwidth_hoy desc
)### as t2 on t1.app=t2.app
 where $filter and (logflag&1>0) and
nullifna(t1.app) is not null group by t1.app,t2.bandwidth_hoy having sum(coalesce(t1.sentbyte, 0)+coalesce(t1.rcvdbyte, 0))>0
order by bandwidth_hoy desc

 
So, the first column would be for the app, the second will be the sum of the bandwidth and then it will be divided by the number of days selected in the $filter, and the third will be the bandwidth of the present day.
And then we Join that fields with a subquery where it is indicated that it will select today's records exclusively.
And it works on the dataset section. It looks like the right table below. It works, I checked the results of the third column with another dataset that I made to see if the data was the same.
The problem ocurrs when I run the report where the chart that uses this query, it shows the same app more than once in some cases (red oval). And it's not even the same info although is the same query with the same time parameters (blue and green ovals). The right one is the correct one.
And I'm not sure is there's something wrong or that it could improve on my query to show the correct data on the report. Or maybe do I have to do something like the example above? something like:
 
select field1,field2 from ###( select...) ### as t1 inner join ###(select...)### as t2

on t1.app = t2.app

 
I hope you could help me.
Regards.
 

Attached Image(s)

#12
hzhao_FTNT
Expert Member
  • Total Posts : 347
  • Scores: 60
  • Reward points: 0
  • Joined: 2014/09/12 10:03:54
  • Status: offline
Re: Not Equal Results between Dataset and Report Chart 2019/11/08 08:52:03 (permalink)
0
Hi shaneboy,
 
Because of the hcache, do not join $log with other tables.
Try below query:
select app, sum(case when dom=to_char(now(), 'YYYY-MM-DD') then bandwidth else 0 end) as today_bandwidth, cast(sum(bandwidth)/count(distinct dom) as decimal(18,0)) as avg_bandwidth from ###(select app, $DAY_OF_MONTH as dom, sum(coalesce(sentbyte, 0)+coalesce(rcvdbyte, 0)) as bandwidth from $log where $filter and (logflag&1>0) group by app, dom order by bandwidth desc)### t group by app order by avg_bandwidth desc
 
Regards,
hz
#13
shaneboy
New Member
  • Total Posts : 5
  • Scores: 0
  • Reward points: 0
  • Joined: 2019/10/18 09:36:31
  • Status: offline
Re: Not Equal Results between Dataset and Report Chart 2019/11/08 21:21:46 (permalink)
0
Thanks hzhao_FTNT,
 
Your query really helped me but it seems that's not working properly when the "today_bandwidth" comes. The result is always 0, the "avg_bandwidth" works fine. When I choose "today" as the time period it does work on both columns, but if I choose another time period like "7 days", "last N days" or anyone else. It shows 0 :/
I thought it was because of the condition on the "Case" condition. It compares the function now() casted to char versus $DAY_OF_MONTH but the now() function shows date like this:
2019-11-08
and $DAY_OF_MONTH shows just the number of the day, so it would be comparing something like this "2019-11-08" = 5.
I made a few changes to your query to compare the current date with a field and now it is like this:
 
select app, sum(case when dom>=current_date then bandwidth else 0 end) as today_bandwidth,
cast(sum(bandwidth)/$days_num as decimal(18,0)) as avg_bandwidth
from ###(select app, from_itime(itime)::date as dom, sum(coalesce(sentbyte, 0)+coalesce(rcvdbyte, 0))
as bandwidth from $log where $filter and (logflag&1>0)
group by app, dom order by bandwidth desc)###
t group by app order by avg_bandwidth desc

 
But it's not working neither on the today_bandwidth, the avg one works fine. Any ideas of why is not working? I'm kinda out of ideas.
 
Regards.
post edited by shaneboy - 2019/11/08 21:23:59
#14
hzhao_FTNT
Expert Member
  • Total Posts : 347
  • Scores: 60
  • Reward points: 0
  • Joined: 2014/09/12 10:03:54
  • Status: offline
Re: Not Equal Results between Dataset and Report Chart 2019/11/12 08:58:23 (permalink) ☄ Helpfulby shaneboy 2019/11/12 17:02:20
5 (1)
Report "last 7 days" doesn't include today's logs. Because $filter will automatically add device and time filter to your query, please make sure you have correctly set time period in report settings. You can use Today/This Week/This Month/This Year or Custom Period.
#15
shaneboy
New Member
  • Total Posts : 5
  • Scores: 0
  • Reward points: 0
  • Joined: 2019/10/18 09:36:31
  • Status: offline
Re: Not Equal Results between Dataset and Report Chart 2019/11/12 17:04:32 (permalink)
0
Hi hzhao_FTNT,
 
It worked fine, thanks a lot!
#16
Jump to:
© 2019 APG vNext Commercial Version 5.5