Data Loss Prevention

 View Only

SQL Query for Average Scan Speed Across Whole Estate – Discover 

Feb 08, 2019 10:06 AM

The Brief

This article builds on a query covered in one of my previous pieces. As usual, the full query is at the bottom if you want to skip right to it. If not, I’ll go over the context and walk through why I’m doing what I’m doing. If you're interested in seeing a SQL query to show the full DLP Scan History, click here.

This task arose from a requirement from our client to have their whole estate scanned. An illustration of the previous query dealing with this is shown below:

 

As consultants, we provide evaluations of the client estate and from this, estimate timescales, costs, scan speeds etc. Therefore, being able to proactively monitor the client environment and tackle issues before they arise is of paramount importance. Nobody wants to feel as if their time/resources are being under-utilised.

In our case, we had undertaken a project to use DLP to scan a large client estate (Petabytes and petabytes). The original query I created for this would be fine, if not for one problem. The client has multiple scanners in use simultaneously. Therefore, a further solution was needed.

This query is the result of that process. A wordy way of putting it would be; This query provides an average of the average speeds per hour, in GBs across an entire estate.

Here is what the query looks like when it has been ran:

 

It may not look like much. That being said, if you calculated the project timescales, resources needed and costs based on 40GB/hour, seeing this could set off a lot of alarm bells for you right now.

 

The Maths

The meat of this query revolves around calculations. The first calculation is to transform bytes into GBs, the second is to transform the milliseconds into hours and the final calculation is to divide the data transferred by the amount of time taken. This is all wrapped up nicely in a SUM function. As before, we're pulling from the WALK table.

The syntax is:

AVG(TRUNC((TRUNC(BYTESSCANNED*1e-9,2) ) / NULLIF(ELAPSEDTIME, 0) * 3600000,2)) AVG_AVG

As you can see, I’ve also had to implement NULLIF. The reasoning behind this is that when a scan is queued, even if not doing anything, DLP commits this to the database. Naturally, scans that haven’t done anything yet won’t have values against columns such as BYTESSCANNED or ELAPSEDTIME.

I'm also fairly certain your aliases will be better than mine.

Following that thought, if we’re doing calculations, we don’t want to break the universe by dividing by 0. Hence the error handling.

Last thing to mention on this point is the results are truncated to two decimal places. This is mainly to ensure the data is readable in practice by the recipients of the report.

 

The WHERE Clause

In this query, the where clause is almost as important as the calculation itself. As we’re dealing with an aggregate function, any rows which don’t add (pun absolutely intended) to the result in a meaningful way need to be discarded, else the results will be distorted.

You’ll see below that I remove any results which are 0 from the result set from the BYTESSCANNED, ELAPSED TIME, and result of the AVG_AVG columns.

Syntax:

WHERE BYTESSCANNED NOT LIKE '0' 
AND ELAPSEDTIME NOT LIKE '0'
AND TRUNC((TRUNC(BYTESSCANNED*1e-9,2) ) / NULLIF(ELAPSEDTIME, 0) * 3600000,2) NOT LIKE '0'

 

The Conclusion

The full query is below as promised. What we’ve seen from the above is that the exploration of SQL in relation to the DLP product is consistently yielding more powerful tools which can be used to enhance the product. In this case it has led to a direct way of monitoring a project in practical terms.

 

I hope this article has helped.

 

Thanks

 

Chris

SELECT
AVG(TRUNC((TRUNC(BYTESSCANNED*1e-9,2) ) / NULLIF(ELAPSEDTIME, 0) * 3600000,2)) AVG_AVG
FROM WALK
WHERE BYTESSCANNED NOT LIKE '0' 
AND ELAPSEDTIME NOT LIKE '0'
AND TRUNC((TRUNC(BYTESSCANNED*1e-9,2) ) / NULLIF(ELAPSEDTIME, 0) * 3600000,2) NOT LIKE '0';

 

Before You Go

If you would like to read my other articles on SQL-DLP you can find them here:

 

Statistics
0 Favorited
1 Views
0 Files
0 Shares
0 Downloads

Tags and Keywords

Comments

Oct 06, 2019 04:14 AM

grt post.

Apr 01, 2019 05:13 PM

Great post, Christopher! 

Related Entries and Links

No Related Resource entered.