Month: March 2015

Regression test-suite for BI solution – Part 1

In this series of blog posts, I’ll explain how to setup a regression test-suite for a BI solution (more specifically a BI solution consisting of an olap cube build with SQL Server technology). Well, not just create a new test-suite, but how to automate the process of creating this test-suite. I’ll explain how to achieve this automation with the help of NBi and genbiL. GenbiL is a language to script the creation of test-suites for NBi.

First point, what’s a regression test-suite: the goal of this kind of test-suite is to ensure that the changes introduced in the solution during the last release have no unexpected side effects on existing features. For BI systems, it means testing that the introduction of a new dimension is not changing the aggregation results on the other dimensions or testing that the performance improvements in your ETL don’t affect the effective end-result in your data warehouse.

One way to achieve such result is to replay the tests created during the previous releases. To be able to apply this strategy, you must have started with automated testing since your first release … it’s not often the case. More, after a few releases maintaining the backlog of tests becomes really cumbersome and time-consuming.

Another approach is to execute the same actions (queries, …) on two instances of your solution (next and current releases). You’ll need to ensure that the results of these actions (result-sets, performances, …) are the same on both versions. The great benefit of this strategy is that the actions (queries, run of etls, …) can be automatically generated. If we apply this approach on a cube, we could generate “random” queries crossing the dimensions and measures. The comparison of the results of the queries on the two instances will tell us if some regressions have been introduced (or not) in the new release (According to the idea that the cubes are filled with the same data). That’s the basic idea and we’ll explore it.

First point: we need a list of existing dimensions, hierarchies, in some cases levels and measures. One option is to create this list manually, but a better approach is to extract this list from the cube (The one currently in production). To achieve this, we’ll use Dynamic Management Views (DMVs) and more specifically MDSCHEMA_DIMENSIONS and MDSCHEMA_HIERARCHIES. These two DMVs will return two listings of existing dimensions and hierarchies for a given cube.

For the dimension, we’ll extract the (unique) name of the dimension (useful to write a query) and its caption (useful when comparing structure’s tests). Note that you need to filter on the cube’s name in the where clause.

select
	[Dimension_Unique_Name]
	, [Dimension_Caption]
from
	$system.MDSchema_Hierarchies
where
	[Cube_Name]='Adventure Works'

You can test this query by starting Sql Server Management Studio, connect to your cube and create a new MDx query, copy/paste this query before pressing F5 to run your query.

The following query is extracting relevant information about hierarchies. In this case we’ve also added the unique name of the parent dimension. This will be useful to join together the two datasets (dimensions and hierarchies).

select
	[Dimension_Unique_Name]
	, [Hierarchy_Caption]
        , [Hierarchy_Unique_Name]
from
	$system.MDSchema_Hierarchies
where
	[Cube_Name]='Adventure Works'

Unfortunately, it’s not possible to execute joins directly on the DMV (The query engine for DMV supports a limited set of keywords). So it’s impossible to join the results of two DMV. This problem will be solved by genbiL but we’ll explain how later.

Now, back to the DMV and their execution, we need to script the execution of these queries. Genbi is a tool to support the creation of test-suites in the format understood by NBi. genbiL is a macro language for Genbi. We’ll script in this language and for this we need to chose a text editor (Notepad++ will do the job). The good news is that genbiL supports to load test-cases from queries, so we can load the needed information from the DMV.

The syntax is quite easy:

case load query
{
  select
	[Dimension_Unique_Name]
	, [Dimension_Caption]
  from
	$system.MDSchema_Dimensions
  where
	[Cube_Name]='Adventure Works'
}
on 'myConnectionString';

Just replace the ‘myConnectionString’ by the expected connection string of your cube.

Once you’ve loaded the set of test-cases, you can apply it to a template. In this case choose the predefined template named ‘dimension exists’.

The syntax is quite easy:

template load predefined 'ExistsDimension';

This template is expected values for the a variables named perspective and dimension but these variables are not immediately available in the set of test-cases. First we need to rename [Dimension_Caption] into dimension.

case rename column 'Dimension_Caption' into 'dimension';

Then we need to create a column perspective and assign the value ‘Adventure Works’ to this newly created column.

case add column 'perspective' values 'Adventure Works';

Now that the set of test-cases and the template are correctly loaded, we just need to generate the test. It’s done with the sentence:

suite generate;

Before saving our test-suite we need to define the two connection strings as defaults for system-under-test and asserts.

setting default sut connectionString '...connection string to the next release ...';
setting default assert connectionString '...connection string to the current cube ...';
suite save 'myRegression.nbits';

Your script should look like

case scope 'dimensions';
case load query
{
  select
	[Dimension_Unique_Name]
	, [Dimension_Caption]
  from
	$system.MDSchema_Dimensions
  where
	[Cube_Name]='Adventure Works'
}
on 'Provider=MSOLAP.4;Data Source=.\SQL2014;Initial Catalog=Adventure Works DW 2012;Locale Identifier=1033';

template load predefined 'ExistsDimension';
case rename column 'Dimension_Caption' into 'dimension';
case add column 'perspective' values 'Adventure Works';

suite generate;
setting default sut connectionString 'Provider=MSOLAP.4;Data Source=.\SQL2014;Initial Catalog=Adventure Works DW 2012;Locale Identifier=1033';
setting default assert connectionString 'Provider=MSOLAP.4;Data Source=.\SQL2014;Initial Catalog=Adventure Works DW 2012v2;Locale Identifier=1033';
suite save 'myRegression.nbits';

If you execute this script with genbiL.exe (just pass the path of your script as an argument) then genbil will generate a test-suite with around 25 tests, asserting the existence of the different dimensions. Observing carefully the test-suite, you’ll probably see that this suite contains a test for a dimension Measures. You’ll probably want to filterout this special dimension. For this just apply a filter to your set of test-cases. To filter out this dimension include the following line before generating the test-suite.

case filter on column 'dimension' values not equal 'Measures';

An interesting point is that you can use your script from release to release and you don’t need to maintain manually a list of all your dimensions. You can now easily add the tests of existence for hierarchies by yourself.

An interesting question would be, how to generate tests of type subsetOf. You need to use, the “grouping” option. But before, you need to ensure that the column ‘dimension’ is the last one. For this the first step will be to ensure that we haven’t unneeded column, the best approach is to use the hold command and then moving the column dimension to the correct position.

case hold column 'perspective', 'dimension';
case move column 'dimension' to last;

The you need to select the correct template and then finally generate the test suite with the grouping option:

template load predefined 'SubsetOfDimensions';
suite generate grouping;

This will generate one additional test, listing the expected dimensions.

In next part, we will focus on crossing two set of test-cases and generate queries.

Advertisements