Skip to content

False positives for TFtAccumulativeTimeDetector #8

@krizhanovsky

Description

@krizhanovsky

We can’t just select the clients with the highest cumulative response time, because the most popular fingerprint generates the most number of requests, each with small response time, producing huge cumulative response time. Also we can’t just query the top request_time - there are a lot of rare outliers, produced by normal clients. An intersection of top 20 talkers with top 20 cumulative response time give us more meaningful result:

SELECT hex(tft) AS tft_hex, sum(response_time) AS tot_resp_time, count() AS tot_req,
          sum(response_time)/count() AS avg_resp_time
   FROM access_log
   GROUP BY tft
   HAVING tft IN (
       SELECT tft FROM (
           SELECT tft, sum(response_time) AS s
           FROM access_log
           GROUP BY tft ORDER BY s DESC LIMIT 20
       )
   ) AND tft IN (
       SELECT tft FROM (
           SELECT tft, count() AS c
           FROM access_log
           GROUP BY tft ORDER BY c DESC LIMIT 20
       )
   ) ORDER BY avg_resp_time DESC;

But I'm still not sure about accuracy of this query - TBD.

Metadata

Metadata

Assignees

No one assigned

    Labels

    crucialquestionFurther information is requested

    Type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions