BI solutions

Use test-cases to validate your design with stakeholders

This post is part of a series about specifications, requirements, work items and test-cases in the field of Business Intelligence solutions. First part is about why I hate and I’d not recommend specifications, the second is about how to write good requirements for BI solutions. Then, in the third part I set the focus on work items and what does it mean according to me. I’ll now explain the role of test-cases in this strategy to deliver good BI solutions.

Let’s go back and see what is on the table until now:

  • Requirements:
    • A description of the business process to be analyzed
    • List of “what” (and not “how”) the stakeholders are expecting to do with our solution.
    • Information about the end-users (experience, …)
  • Work items
    • A general design, the solution that I’m planning to build
    • A list of deliverables to be implemented, each of them is a small and independent (meaning that they can be discarded or postponed).

Until now, we haven’t validated anything. General design and work items are just the team’s vision of what could be the best solution for the end-users. If we’re implementing this, we take a lot of risks. Too many! If something is not correctly described in the requirements or something not understood in the business process, we’ll only know it when revealing the solution to the end-users, it will be too late. We must have a hand shake between the implementation team and the stakeholders before implementing the whole solution, ideally before developing the first lines of code.

invisible-handshake-with-giant

Keep in mind that you should never ask to the stakeholders to give a “go” for the whole solution. Will you ask them to confirm that the solution you’ve in your head is perfectly what they want with just a high level description (slides, whiteboard, …). It’d just be impossible (even for you). That’s the same than expecting from you to give a precise amount of hours to implement a solution with just a few lines of requirements.

We need to have a validation from stakeholders but what can we ask them to validate? Work items? The main issue with work items is the difficulty for end-users to have a good vision of what will be the solution: “What they will have in their hands”! Work items are usually too small to give them this overview. If we ask them to validate something else than the work items, we’ll also be responsible that work items and this “other thing” are related to each others during the whole implementation process.

So, the first option is to write a 3000 pages document with a really detailed explanation of what will be built. Cross your fingers that the stakeholders have time to read it (by experience, they won’t … and yes it will be your problem now or later but it will be). If they have time let me guess that won’t understand more than 50% (again sooner or later, it will be an issue for you).  Don’t forget to pray that you’ve not make some small mistakes in this description, or it will be your responsibility! You’ll need to deliver it (even if it’s not possible)… and don’t forget that this document must be written before the development so without any compiler to check your rough ideas. Oh, finally, you’ll need to maintain the link between this document and your work items. From my experience, go an buy a lottery ticket, you’ve more chances to succeed.

lottery-ticket.png

A second option? Write test-cases. Most of the industry practices ask stakeholders to validate “designs” or “specifications” but rarely “test-cases”. When you think about it, it’s weird: test-cases are much more precise and less abstract than the two others, so it’s easier to understand for the stakeholders and we’ve less opportunities to discuss the interpretation of a test-case than the interpretation of a specification. Funny. From my experience of the software industry, it’s usually because consultancy company don’t write test-cases … they are relying on end-users to effectively tests! And both (stakeholders and consultants) think that they are saving money with this “strategy”. Are we serious?

are-you-serious

Let’s go back to test-cases. A common definition is “A test-case is a set of conditions under which we will determine whether an application, software system or one of its features is working as it was originally established for it to do”.

We can reverse the definition and say that “the way the application, software system or one of its features is expected to work is define by a set of conditions named a test-case”. It means that to express the result that I want (the what), I’m explaining what I’ll execute(the how). It gives a very good opportunity to keep the discussion with the stakeholders at the what-level and not the how-level. When you’re validating the test-case, you ask them to validate that it’s effectively the expected result … and not that it’s effectively how they were dreaming to implement it.

The difference between validating a “specification” and a “test-case” can be explained in a few words: Don’t ask them to validate a formulae, ask them to validate the output of the formulae when it receives a given set of parameters in input. That’s completely different.

Let me be clear on one important topic: validating a test-case has two points. The first is obvious, when you’re giving these parameters, you’re expecting this result. The second is less obvious ; you must also validate that the end-users are effectively able to give the parameters. For many reasons (usage of codes in place of labels, data not available at the moment of the request, …), the solution could not be implementable or usable!

How should you build the test-cases? From the end-user point of view: Show them the report or pivot table that are expecting/will receive. Don’t bother them with validation of the ETL … they don’t care and it’s not their problem, it’s yours. I agree that 50% of the errors will be in the ETL but it’s not something that you should discuss with end-users!Focus on deliverables: a dimension, a measure-group, a report and alert!

point-of-view.png

The big issue with BI solutions is usually that when we’re speaking about input parameters, we’re speaking a thousand or a million values. Building datasets for test-cases can be a huge task. My recommendation is to use real and already existing data as much as possible. If you can use production data (of the previous years). If you can’t argue for it and show to stakeholders the cost of building datasets … they could change their mind. If it’s really not possible, you’ll need to build datasets. But in both cases, really think about the need to assert the final result on all the data, a subset won’t be enough.

To illustrate this, I’ll explain my own case. At my current job, we’ve complex aggregations over time and the daylight savings switches are a nightmare for us. Some testers would try to validate all the aggregations at the day level for the 14 years of data. It will take them days to implement  … Who will validate such a result? do you really think that someone will spent his time to validate 5000 complex sums/averages/weighted averages? No one. I’m recommending to test one standard day and to test the two daylight savings of a same year. 3 tiny test-cases versus one huge test-cases.

There is another big advantage to use tiny test-cases versus huge test-cases. When my test will be successful for the standard day and the summer daylight saving but not for winter, I’d directly understand that I’ve a problem in loading my data for the winter shift and nowhere else. But when your huge case is failing, you must first analyze it deeply to check what is exactly failing and it could take a lot of time before realizing that it’s just for the winter shift that you’ve an issue.

But what will happens if my solution is correct for these 3 days but not for the others? I could have a bug that the solution is not loading the data before the year 2015! That’s a possibility and it’s why I should have some tiny tests with aggregation at the year level to ensure that I’m not only loading the last years of data. … but check my words … my test has changed! I’m not anymore testing that I’ve a correct aggregation, I’m testing that I’ve plausible values! Something that is usually much more quick to write and to validate.

The building of the set test-cases should be done with stakeholders. lets go back to an illustration from the real world. A few years ago, on of my testers spent many hours to implement a mock-up of a pivot table. The mock-up was with real figures and he spent a lot of times to calculate the expected content of each cell (around 16). After one day of work, he met the subject matter expert. In less than one second, the answer was there: “No, something is wrong!”. How could he says in less than 10 seconds that it was wrong. First, we had an issue with units (KW or MW), it means that all the figures had been a thousand times larger than expected. for him it was obvious, for us we had no idea if it was 45 or 45.000. In less than one second the subject-matter expert could identify that something was wrong but also explain how to fix it. Great! But there was something else, another issue. The expert started to move the members in a different sort and then identified another mistake some values were expected to be zero and it wasn’t. The expert explained that the lower part of the matrix should be zero due to the business process and it wasn’t the case. We discussed about the potential root causes of our mistakes and identified a wrong interpretation of a business rule described in the requirements. In just a few minutes we captured 3 potential mistakes in the development (units, sorting of members, business rule misinterpretation) … They continued to work together during one hour and reviewed the whole test-suite, adding test-cases that only subject matter experts could bring on the table and simplifying many test-cases. Great added-value for everyone.

work-together.jpg

At the opposite of writing work items, writing test-cases can be executed by juniors and seniors developers. What … developers writing tests? If your team members or you can’t write tests by themselves how could you hope that you develop with a minimum of bug? How can you anticipate all potential issues?  That’s the definition of testing. If a tester has absolutely no notion of development, I’ve huge doubt that he can anticipate potential issues, it’s just monkey testing. It has some value but does it sound serious to build your testing strategy on exclusively performing monkey testing? I doubt.

Nevertheless, if your team is totally new to testing, it could be useful to add a test manager to your team during a few months (or years). He should be able to review the test-cases with you, help you to identify alternative paths … help your team to go to the next level of maturity in testing. But your developers should write the test-cases by themselves. Not necessarily  write the test-cases on what they are implementing. They can pair with another implementer and one is testing the work-item that the other is coding (and vice versa).

Writing a test-case, is a slow (but valuable) process. I usually recommend a 1:1 ratio between the time of writing (not executing) the test-case and the time to develop the code. Keep in mind that I’m not just writing a test-case on a sheet of paper. I’m also validating my solution with the stakeholders before it’s effectively coded! Reusability? Once validated, I’ll use this artefact to build a robust test-suite that will be automated to validate my development. I’m also building a really good artefact to detect possible issues and impacts with some change requests. Much more than a test that I’ll through away within 10 days. Still not convinced? Have you noticed that I’m not writing documentation … My documentation are my test-cases. With just playing the test-suites, I can confirm that my documentation is inline with my development … living and always up-to-date documentation is priceless.

priceless

 

Writing work items and not tasks

This post is part of a series about specifications, requirements,work items and test-cases in the field of Business Intelligence solutions. First part is about why I don’t want specifications and second is about how to write good requirements. This third part is settings the focus on work items before jumping to the test-cases to validate with stakeholders.

I’m always expecting that the whole team has a good understanding of the business we’re working for (even and especially when it’s a really complex business). The main reason is that it reminds them that our goal is to deliver working software to someone who expects specific benefits from using it … not just deliver a software corresponding to some specifications. To gain this good understanding of the business, everyone need to read (and understand) the requirements.

The key question is “Why do we write work items, if everyone is aware of the requirements?”. The answer is really simple, if you just have a problem, you can have many ways to solve it (probably many poor solutions but some great). First reason to write work items is to help the team to go on the same direction.

What are work items in this case? They are the solution, not the problem.  This solution must be debated with a maximum of team members to be sure that we’re covering all the aspects of the problem. It’s what I’m naming the design. I’m not expecting to have interactions with business stakeholders during this phase, it’s not their role to validate that I’ve a good solution because (at this moment). They will probably not understand what they will effectively receive and should be their unique concerns (the what and not the how). This validation from stakeholders or end-users is for later during the writing of test-cases.

The design phase is not something with a clear start and clear end. It’s something that is starting during the reception of the requirements and is on-going until the end of the implementation. Anyway, you should have a good overview of your final design before starting the implementation.

At the end of this “main” design phase, we need to write the target solution somewhere to be sure we effectively implement what has been debated (No, just a picture of the whiteboard is rarely a good idea). This is a good idea to write down ou design in case we need to adapt it. Indeed if this design is written somewhere, we’ll clearly see the impact of our changes (what to re-do, effect on planning, …) . But also what has been poorly taken into account during this initial design phase … it’s usually an excellent lesson-learnt.

Until now, I’m speaking about a design and not about work items. What’s the difference? Work items are a decomposition of the design. The design is unique, work items are many.

When the team knows what should be done (the solution/the design), we also need to separate the work between the different team members. With one big document describing the solution, it would be a pain. To overcome this pain, I’d recommend to create some work items to subdivise the work to be executed by different resources and indicate the relations between them (precedence, …).

where-to-start

In order to achieve this, I’ve a personal rule: no work item will require more than 40 hours of implementation’s work (dev + test). If at any moment this rule is not validated, the work item must be split (It could be during the design, during the development or during the testing).

But it’s not so easy: We also need to be able to decide that a work item will be discarded due to a lack of time to implement it or a ratio cost to implement versus added-value too low. To achieve this it means that we can’t split the work between layers (database, etl, cube, report). If we need to build a dimension, I can’t split the work by layer and take the risk that we decide that we’ve no time to implement the ETL when the cube, database and reports are already “done“. We always need the implementation in all our layers or this new dimension will be pointless! So don’t split work items by technical layers.

Another good reason to avoid the split by layer is the lack of responsibility of the developers when the work items are expressed this way. Even, if the team has made an amazing job during the design phase, it won’t be perfect. We’ll meet unexpected issues (data quality or technical) and we’ll need to change our plans. When you’re dealing with the dimension from A to Z, you know for sure that any tiny problem during the ETL will create bigger issues during the implementation of the cube. So you need to fix the issue in the ETL … even if it was not initially described in the design. When splitting by layer, it’s really easy for the developer to not care about a tiny problem (to keep his personal planning) and let the next layers fix the issues. It will drive you to a poor solution where all issues are fixed in the last layers with tricks … not something that you’d like to maintain.

What’s a work item? It’s a small part of the global design, that can be implemented in less than 40 hours, discarded of the scope, and that will give a business value to the end-user when implemented.

Added Value - Highway Sign

What about a deployment to QA, a code-review or a refactoring? Are they work items? Not really. Naturally, if you don’t deploy, your solution is pointless but it doesn’t mean that it’s really part of your design. So it’s not a work-item but a task. And the script to deploy? Again not a work-item because it won’t directly give some added-value to the end-users and you can’t really discard this. So again it’s a task not a work-items. Tasks should not be discussed with the end-users (and surely not validated) but must be included in the planning.

Back to the work-items, what does it look like? Our work items are usually entitled:

  • Create a new dimension D
  • Add attributes X and Y to existing dimension T
  • Build a hierarchy Z on existing dimension U
  • Create new measure-group M and link it to dimension D and T
  • Create new report “M by X and Y”

Sometimes, with our experience, we can split a work item into two distincts work items with essential attributes of the dimension in the first-one and some nice-to-have in a second. It will be easier to discard or post-pone the second work-item. But, that’s already an advanced level.

Then there is the description of the work-item and it’s usually where the things are becoming less straightforward … especially for really good developers. The usual problem with poor work items is not in how many we’ve, rarely their duration, and neither on their title but always in the content.

Poor work items are usually too detailed. They don’t write the solution at the conceptual level but at the logical or physical level. Most of the time, they are doing this to help their teammates … but it’s just the opposite that is happening.

An illustration? Surely! If your master data solution has two distinct fields first name and last name and you only need one field name. At the conceptual level, you’ll express that you’re expecting a unique field being the concatenation of first then last name separated by a single space.

Note that I never said if it should be done at the ETL level or in a view just before the cube. I have not explicitly written how to perform the concatenation! It would have been easier for me to write something like first + ' '+ last … except that if the fields first name and last name are not trimmed in the master data, no one would trim them. Why? Because it was not requested! It’s so detailed that it would be too easy for the implementer, to say, I just applied your formulae and it doesn’t work, not my problem. Or even I thought that you specifically wanted to not trim them! Stay at the conceptual level, explicit the result that you want to achieve and not the how-to. If the developer is not sure about the need to trim or not, he will ask the question if you stayed at the conceptual level not if you explicitly requested to do a first + ' '+ last.

Keep in mind that you cannot develop a whole solution without a few bugs. When you develop we’ve a few lifeguards such as a compiler and a few manual runs … writing detailed  work-items is just denying this evidence and telling to others, “take a look I can develop a solution bug-free without a compiler and testing” … everybody knows that you’re wrong. Stay at the conceptual level.

If most of your work items are tasks in disguise, then the development process becomes task-focused (doing things) instead of delivery-focused (creating value) … and that’s usually my root issue with poorly written work items.

doing-things

How to help the juniors without going to deep in the work items?

  1. Before they start to implement, ask them to explain how they will implement the work item. Don’t be too rude with them, keep in mind that you’re asking them to code on a whiteboard (without compiler, …). It’s difficult but at least, you can point them to a few possible issues and especially be sure that they have understood the work item.
  2. During the development, review each couple of hours their code with them. Ask them to explain what and why they are making these choices. Propose alternatives. Let them make their own choices.
  3. Write the tests in parallel of the development and ask them to use the tests during the whole development process (not only at the end when the whole development is done) and discuss any divergence between the code and the tests.

You should notice a few constants in the three suggestions:

  • don’t do their job
  • put them in front of their responsibilities: don’t make the choice for them!
  • work in parallel, not ahead or after … with them.
writer.jpg

Who should write the list of work items? I’ll give two answers to this question. At the beginning, it’s important to make an excellent work when creating the work items. By creating, I mean to decide if it’s one or two or three work items, check the precedence constrains (and avoid loops) and give good titles to these work items. From my point of view, it means that most junior team members won’t be able to do it. If they are, I’d strongly recommend to give them the senior adjective!

This is not a full-time job. If you’ve someone in your team writing work items during days, something is wrong in your process. It usually takes a couple of days of work to write work-items to implement in a couple of weeks. If you’ve a huge team, consider to have several writers but it’s really important that they have a good overview of what others are writing. I’d also recommend to consider to not have one dedicated resource to write the work items, the job-owner must change at every release … minimum. It’s really important that senior implementers keep their hands in the code and do their part of the job by coding … not just by reviewing code and writing work items.

But finally, work-items are alive and they must be regularly updated, reviewed and sometimes discarded. Every team member has the right to propose an adaptation and is responsible to update a work item, not just the initial writer. Every update must be internally debated with a maximum of team members and the initial writer.

You can use the work items in your discussions with stakeholders. It will be your unit to discuss what must be postponed or discarded. If your work items are described with too technical terms, it will be difficult for your stakeholders to discuss at this level … another good reason to stay at a higher level.

Keep in mind that discussing on an artefact to gain a confirmation that it must be prioritized or not is not the same as asking for a validation of the content of the work item. I stated before that work items shouldn’t be validated and I hold on … test-cases will help us to confirm with stakeholders if our design is correct or not! and it’s the scope of the next blog post.

Requirements for BI solutions

A recurrent question that I received after a conferences or during a course sounds really simple but is rather complex to answer: How do you write specifications for Business Intelligence solutions? That will be the starting point of this series of four blog posts about specifications, requirements, work-items and test-cases.

The methodology that I advocate is very business focused. We need to understand business processes, objectives, analytic requirements and associated potential business impact, and eventually project success criteria.

A global Business Requirements deliverable typically covers the following topics: critical business and analytic description including semantic, existing systems and data to support the capabilities required, initial project success criteria, and key issues/risks with tactical and strategic recommendations.

Business Process to be analyzed

Keep in mind that the requirements are not there to define the solution but to explicit the problem. Following this idea, don’t try to model by yourself a report, but describe your business process(es) and your analytical needs. Don’t ask for a report but describe the level of maturity of your end-users and what they will do the data/information. Keep in mind that a BI solution could go further than a data warehouse, a cube or a report but can also be trainings, coaching/support and even new staff members.

I don’t care about the format of the requirements. It could be written on any media that will be readable and shareable in the company. But, keep in mind that implementers should be able to read this document from A to Z. Avoid to write documentation based on hyperlinks as the main mechanism to go from one section to another. It’s great to have hyperlinks to facilitate navigation on some parts of the document, but keep in mind that at any moment they need to be sure about what has been read and what has not been. Based, on this we do not recommend to write your requirements as a website but better on a word document.

Strictly avoid UML or E/R schemas (or any software related notation). Firstly, the level of expertise, with the notation, of each document’s reader/writer will vary, resulting in different and incorrect interpretations of a same schema. Some readers/writers will understands some details, some will not. Keep also in mind that these notations are usually designed to build low level specifications. BI Requirements must stay at the higher level. Last but not least requirements will be validated by stakeholders do you suppose that your CxO is understanding the E/R notation?

A draw is more explicit than a paragraph” is a sentence that I valuate. But I usually add that “a draw without explanations has no value”. If you put some schemas or drawing be sure that they don’t provide information that are not explicitly written in the paragraphs surrounding them. As a guideline, keep in mind that I’ll not read the schemas and draws. They will just be used to double-check my own understanding. Do the same!

The first section of your document should be a large introduction to your business. We’re dealing with BI, we can’t be expert of each process of the company so you need to provide us enough information about the processes without having to read pages of details. The first section deals with this idea: a description of the business process to be analyzed with the BI solution.

This section of the document should explain the Business Process that you want to analyze through Business Intelligence. I just put the stress, I’m not interested, at this step, to know “how you’ll analyze” but I’m focused on “What will be analyzed”. The existing Business Process generating data that we’ll transform to provide some information that will be analyzed.

If you want to analyze the relation between the customers’ budget and the sales then I’m expecting that you describe “how customers are buying things with your website”, “how customers are buying in your shops” and “how we get information about their budget”. In other words, what are the different activities of the Business processes “Internet Sales” and “Shop Sales” including especially but not exclusively the activities “CRM” and “Order”.

This is really a Business Processes description that I’m expecting and not an application process. At this moment, information about software supporting the business processes are more a distraction than really helpful.

This part of the document is critical, the more correct and exhaustive this part will be, the faster we’ll be able to start the development.

Define the terms

Vocabulary or semantic is something essential in Data Modelling and so Business Intelligence. You need to provide a definition of terms that are used in the Business Processes described previously.

This vocabulary will be also reused in the resulting BI Solution. So if you’re using the word “Customer” in your requirement we’ll use “Customer” and do not expect “Client” in the implementation.

Avoid usage of synonyms. It’s probably a better writing style to make usage of them, but it’s also ambiguous. If a “Customer” and a “Client” have the same meanings, choose one of them and use it until the end of the document. I’m not expecting a Goncourd or Nobel price of literature but clear business requirements.

Be explicit about your definition. I don’t expect and are not interested by a list of items with two words standing for a definition. Give us a context and rules to apply to validate your vocabulary.

Example: Define “customer” for a hotel chain

Possibilities include:

  • A customer is someone who makes a reservation
  • A customer is someone who has had at least one stay at a hotel
  • A customer may be a business or individual who rents a hotel facility
  • A customer is someone who signs up for the hotel loyalty program

Candidate definition:

A customer is an individual, group, or business who has received one or more hotel services, including hotel stays, facility rentals, and/or other additional hotel services, such as spa services. Hotel services may or may not have been paid for by the customer. A customer does not have to complete a hotel stay to be considered a customer.”

It’s usually easier to understand and assimilate if the definitions are given in-text with the business process description. I do not recommend a dictionary or glossary disconnected of the business process description. It’s usually difficult to maintain and difficult to read.

Often, I’m receiving documents where the semantic in the drawings and the text is not the aligned. It’s really confusing. Even if you’re copy-pasting schemas from other documents ensure that the semantic is the same than in other parts of your document. If, for editorial reasons, you can’t update the text on the drawings then clearly state that the terminology is not identical and give clear correspondences.

Analytics targets

In the second section of the document, we’re dealing with the analytics that should be conducted on top of the previously described Business Processes.

The topic is not only what we want to analyze but also about the skills of the users. The more-skilled the users are, the more degree of freedom we’ll offer. But if your users have absolutely no IT knowledge and no time, we’ll probably deliver a report in their mailbox.

 Users: skills, mobility and audience

Typically, users will fall into one of three categories in relation to the timeliness of data required:

  • Report broadcasts and scheduling: Many BI users won’t require real-time data access. For example, the marketing department might rely on weekly sales reports to gauge the effectiveness of brand strategies and specific marketing campaigns over a significant period of time. Daily, or real-time reports – such as transaction level reports – would provide too much detail. The strength of BI in this instance is in its ability to track and identifying trends over time, while simultaneously allowing users to connect shifts in the data to real-life events (marketing campaigns) to support strategy development.
  • Alerts: In some instances, certain user groups won’t need to actively monitor business activities, but just the ability to be alerted to abnormal or unacceptable circumstances – when key metrics fall outside predefined parameters. For example, the accounts payable team don’t need constant data access to perform their daily tasks. However, if an invoice is set to pass its payment deadline, an alert can grab attention to ensure that it’s processed in a timely manner.
  • On-demand data access: Some users will require the ability to directly access and search real-time, or near real-time, operational data and reports, as well as perform ad-hoc queries. For example, BI users coordinating the intersection of ‘just in time’ manufacturing and supply chain operations would demand the ability to access, analyze and act on up-to-the-minute reports and data to ensure the timely delivery of stock from the warehouse, to transport operators, and to retail outlets.

Be sure to give a correct overview in terms of skills, mobility and point-in-time of the users that will use the solution. It’s expected that you’ve more than one type of users. Different categories of users will access different artefacts of the solution.

Analytics needs

Articulate your reflection in terms of question that will be asked. Don’t stop on the first question. If users tell you that they want to “Monitor Quality”, go further ask them to explain how they will define “Quality”. Is the difference between two measures in percentage or absolute value? Ask them what they will effectively monitoring? Will they check if the quality improves? Do they want to receive alerts when the quality decrease? Is just one measure bellow the expectation enough to ring the bell of the alerts?

It happens that users have no idea for these questions. They just want the raw data. In such case, they also ask to be able to define a lot of parameters by themselves. It’s a current and normal situation. It probably means that you need to add a phase of data discovery in your business intelligence path. Self-Service BI is in the BI Toolset to provide some data discovery features to end-users.

You don’t need to be especially exhaustive in the definition of the analytics needs. BI models have the ability to always answer to a family of question and not a unique question. So if you’ve forgotten an analysis in your requirements, you’ve a probability that you’ll still be able to answer this question with existing models. But it’s only a probability, other families of analytic needs will require different models. So it’s probably better to describe non-similar needs.

Business Impact

The last part of the document is optional. We can build a solution without it. Anyway, I’ll still be interested and will surely read it in detail. With this section, I’ll probably provide a better solution and share your business objectives.

This part of the document is more important for the stakeholders to have some support when requesting a budget. It explains what are the expected return on investment and especially how do you think they will be achieved. What are the changes in the business processes that you’ll support with this new application? How these information needs map to the business goals?

Explain me your problem, I’ll find a solution.

A recurrent question that I received after a conferences or during a course sounds really simple but is rather complex to answer: How do you write specifications for Business Intelligence solutions? That will be the starting point of this series of four blog posts about specifications, requirements, work-items and test-cases.

A key point is what do you mean by specifications? I tend to agree with this definition from Business Dictionary:

Exact statement of the particular needs to be satisfied, or essential characteristics that a customer requires (in a good, material, method, process, service, system, or work) and which a vendor must deliver. Specifications are written usually in a manner that enables both parties (and/or an independent certifier) to measure the degree of conformance. They are, however, not the same as control limits (which allow fluctuations within a range), and conformance to them does not necessarily mean quality (which is a predictable degree of dependability and uniformity). […]

In a few words, specifications let you measure conformance not quality.

IMHO, they are pointless. Do you really want to open a debate with your customer or stakeholder by the following sentence: “Yes, we’re delivering bullshit but it’s 100% conform to the specifications!”. Personnaly, not. And it’s independent about who is writing the specifications or if they have been validated or not. Who cares? Seriously, will you accept specifications from your plumber? Are you sure to understand 10% of his technical wording? You don’t care about all the documents, you want a good solution. A better way to collaborate: you explain your problem to him and he’ll find and implement a good solution.

You’ve two kind of issues with specifications:

  • They don’t describe the problem but a potential solution. These requirements don’t let some latitude to the business intelligence specialists to design the best models and visualizations. Generally the resulting development is not performant and sub-optimal and overly complex, leading to a solution not accepted by end-users.
  • They do not provide enough specificity to guide development of the BI databases and applications that deliver the BI or to guide the business process changes that deliver the bang for the buck to the business.

For the second kind of specifications, it’s usually easy to explain to your stakeholders that you can’t come to a good solution with generic sentences. But it’s more tricky with the first kind of specifications explained above. They will tell you that the solution is already explained and you just have to implement. From my experience, don’t accept this kind of mission. Explain them that we’re BI professionals, it’s up to us to understand the needs of our customers and to deliver solutions with a high quality. It’s not their responsibility to explain how they would like to solve their own issue.

So if I don’t like specifications, what am I recommending to write? Requirements, Work items and test-cases. But that’s for a few other blog posts starting by this one.

Tracking security changes on your SQL Server development environment

A few months ago, I was fed up to receive complains from my teammates about security settings changed on the development (or test) server. Every week, someone was at my desk complaining with a variation of “My development was working previous week and now it’s not working anymore. Someone (not me) must have change something about security but I don’t know who and why!”.

To give you a correct picture about our SQL Server environment, we’ve one instance for the dev server and another the test server but many people are developing on a given project. Consider also that we’re developing different projects at the given moment. All team members are (full) administrators of the dev and test servers. All team members know that the rule about security is “reduce privileges to the minimum required” (definitively db_owner is not a solution) and this rule will be checked during code reviews.

Some will consider a classical approach, supported by most DBA: we should restrict the administrator rights to less people: a kind of single-point-of-security. Well, it was probably my opinion seven years ago but my position has dramatically changed on this. This point of view is really based on my own environment … your mileage will vary but keep in mind that I’m speaking at the dev and test level not production. Main point for this position is that I’d not recommend a “single-point-of” for any agile development. The true names are not “single-point-of” but “bottleneck”, “not-shared-decisions”, “I’m-the-only-one-who-know-don’t-fire-me” and “not-my-business-ask-this-guy” … four things that I’m trying to avoid as much as possible.

I implemented a solution to track any change to “security” on the SQL Server. Not a word to my team members … I was the only one to know that something has been scripted and deployed. After a few weeks, my conclusion was clear: nobody in the team is trying to sabotage the work of others. After investigations, issues could be grouped as

  • You changed the code and it’s now requiring new privileges: Script them and stop blaming others.
  • During development, you manually added some privileges to the database. But the deployment script is not including all these privileges. Someone has redeployed the solution from scratch, discarding all the unscripted privileges. It’s your mistake.
  • Less frequent … indeed, someone on another solution is also developing on this shared component (SSISDB) and has a different definition than yours for this shared role. Discuss together and align your needs.

So clearly it was not a question of responsibility or trust but a problem of practices:

  • All privileges must be scripted (and reason to request them documented) and committed in the solution. It must be part of the deployment artefact (at least in dev/test).
  • All the work performed on SSISDB (or any shared component) must be executed on a shared project

That’s not enough, practices not supported by tools are pointless and will never be applied during more than a month. We must provide a tool for people to know who has changed “security”. The following trace is doing the job:

use [master];
go

create server audit
    [Audit Login Changes]
to file
(   FILEPATH = N'N:\Trace\'
    , MAXSIZE = 1024 MB
    , MAX_FILES = 10
    , RESERVE_DISK_SPACE = OFF
)
with
(   QUEUE_DELAY = 1000
    , ON_FAILURE = CONTINUE
    , AUDIT_GUID = '33e055a9-1924-47c3-9798-e8875a5bcd42'
);

alter server audit
    [Audit Login Changes]
with (state=on);
go

create server audit specification
    [Audit Login]
for server audit [Audit Login Changes]
    add (DATABASE_ROLE_MEMBER_CHANGE_GROUP)
    , add (SERVER_ROLE_MEMBER_CHANGE_GROUP)
    , add (DATABASE_PERMISSION_CHANGE_GROUP)
    , add (SERVER_OBJECT_PERMISSION_CHANGE_GROUP)
    , add (SERVER_PERMISSION_CHANGE_GROUP)
    , add (DATABASE_PRINCIPAL_CHANGE_GROUP)
    , add (SERVER_PRINCIPAL_CHANGE_GROUP)
    , add (SCHEMA_OBJECT_PERMISSION_CHANGE_GROUP)
with (state=on);
go

and the following query will help you to investigate the traces. Since this instant, I just received a few complains and it was about Windows (disk access for read/write files with SSIS) issues.

declare  @LogFilePath varchar(255);

select
	@LogFilePath=log_file_path
from
	sys.server_file_audits
where
	name='Audit Login Changes';

set @LogFilePath=@LogFilePath +'*.sqlaudit';

select
	session_server_principal_name
	,database_name
	,object_name
	,m.class_type_desc
	,statement
	,convert(datetime,
		switchoffset(
			convert(datetimeoffset, event_time)
			, datename(TzOffset, sysdatetimeoffset())
		)
	)
from
	sys.fn_get_audit_file (@LogFilePath,default,default) f
inner join
	sys.dm_audit_class_type_map m
	on m.class_type=f.class_type
order by
	event_time;

I know that if someone wanted to hijack this trace it would be easy to stop it (and restart it), but keep in mind that I’m not trying to fix a security issue but to fix a process issue during the development.