Evaluate an expression to validate the result of a query

During the validation process of your BI solution, it could be useful to check that a given expression is validated for each row returned by a query. A sample? Yes surely, you’ve a query returning all the customers having bought for more than $10,000 in your stores. You should ensure that the query returns only rows with a column TotalDue greater than 10000. If we apply this query to AdventureWorks2012, it should look as

select
	[FirstName] + ' ' + [LastName]
	,TotalDue
from
	[Person].[Person] pers
inner join
	(
		select
			[pers].[BusinessEntityID]
			,sum(TotalDue) as TotalDue
		from
			[Sales].[SalesOrderHeader] soh
		inner join
			[Sales].[Customer] cust
			on [cust].[CustomerID] = [soh].[CustomerID]
		inner join
			[Person].[Person] pers
			on [pers].[BusinessEntityID] = [cust].[PersonID]
		group by
			[pers].[BusinessEntityID]
		having
			sum(TotalDue)>10000
	) TotalDueByPerson
	on [TotalDueByPerson].[BusinessEntityID]=[pers].[BusinessEntityID]

The most common assertion for datasets within NBi is equalTo but it can’t help you for this task. Indeed, the assertion equalTo requires that you know exactly the count of rows that will be returned by the query. If you compare two datasets with different count of rows, the test will surely turn red complaining about missing or unexpected rows.

So if equalTo is not the graal, we should take a look to the another assertion dedicated to datasets’ validation: evaluate-rows. The assertion evaluate-rows is checking that a list of expression are validated for each row of the dataset. If at least one of these expressions for at least one row is not validated, then the test will turn red.

The assertion evaluate-rows is mainly built to support calculations on based on values and compare the result of the expression’s evaluation to a specific column. To support the case explained above, we will need to be creative. The idea is to validate the content of the cell TotalDue based on it’s own value. Indeed, we will express in our test that if the value of column TotalDue is greater than 10,000 then we’ll return this value as the the expression’s result, else we’ll return -1. When the framework will compare the value of the column and the value of the expression, each row with a TotalDue greater than 10,000 will validate and if the amount is less than (or equal to) 10,000, this value will be compared to -1 … and this row will fail the test.

<assert>
    <evaluate-rows>
         <variable column-index="1">TotalDue</variable>
         <expression column-index="1">if(TotalDue>10000,TotalDue,-1)</expression>
    </evaluate-rows>
<assert>

Tested with the query above, the test will success.

If we’re changing the query above to filter on TotalDue greater than 50 in place of 10,000

having
 sum(TotalDue)>50

The test will return a failure and turn red with following message:

7263 of the 7309 rows don't validate at least one expression.

Row 0: 
    The expression ' = if(TotalDue>10000,TotalDue,-1)' is not validated.
    The expected result was '-1' but the actual value is '162.7444'
    Variable 'TotalDue' had value of '162.7444'

Row 1: 
    The expression ' = if(TotalDue>10000,TotalDue,-1)' is not validated.
    The expected result was '-1' but the actual value is '3019.6307'
    Variable 'TotalDue' had value of '3019.6307'

Full test:

<?xml version="1.0" encoding="utf-8" ?>
<testSuite name="Acceptance Testing: members ordering" xmlns="http://NBi/TestSuite">
  <test name="Blog Post: Evaluate's expression">
    <system-under-test>
      <execution>
        <query connectionString='Data Source=mhknbn2kdz.database.windows.net;Initial Catalog=AdventureWorks2012;User Id=sqlfamily;password=sqlf@m1ly'>
          <![CDATA[
          select
          [FirstName] + ' ' + [LastName]
          ,TotalDue
          from
          [Person].[Person] pers
          inner join
          (
          select
          [pers].[BusinessEntityID]
          ,sum(TotalDue) as TotalDue
          from
          [Sales].[SalesOrderHeader] soh
          inner join
          [Sales].[Customer] cust
          on [cust].[CustomerID] = [soh].[CustomerID]
          inner join
          [Person].[Person] pers
          on [pers].[BusinessEntityID] = [cust].[PersonID]
          group by
          [pers].[BusinessEntityID]
          having
          sum(TotalDue)>50
          ) TotalDueByPerson
          on [TotalDueByPerson].[BusinessEntityID]=[pers].[BusinessEntityID]
          ]]>
        </query>
      </execution>
    </system-under-test>
    <assert>
      <evaluate-rows>
        <variable column-index="1">TotalDue</variable>
        <expression column-index="1" type="numeric" tolerance="0"> = if(TotalDue>10000,TotalDue,-1)</expression>
      </evaluate-rows>
    </assert>
  </test>
</testSuite>
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s