Power BI: When to (Possibly) Use a Bi-Directional Filter in Your Data Model

Power BI loves a star schema, and it loves single-direction joins: one-to-many from your dimension (description) tables to your central fact (business metrics) table.

The reason for the single-direction filter join? You want to be able to filter your measures by descriptors: Sales by Product Category, Quantity Sold by Business Unit.

There isn't normally a need to filter a descriptor by a measure; in a single-direction model it can't happen.

This is nice, simple logic which works really well. However, there are some use cases where it may not apply. Let's look at our Activities model to see this in action.

Activities Model

  • The Date table is used in lots of other models and holds dates from 1990 to 2035.
  • This model only has 10 years of data, and it's important that the latest date for all the DAX and filters is not the current date but the last date in the data.
  • The model is at year level, and the last year in the data set is currently 2019 (we are now in 2021).
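A common way to anchor "latest" logic to the last date in the data rather than to today is a measure like the sketch below. The table and column names ('Activities' as the fact table, 'Date'[Year]) are assumptions about this model:

```dax
-- Sketch only: derive the latest year from the fact table, not from TODAY().
-- 'Activities' (fact) and 'Date'[Year] are assumed names for this model.
Last Data Year =
CALCULATE (
    MAXX ( 'Activities', RELATED ( 'Date'[Year] ) ),
    REMOVEFILTERS ()
)
```

Other measures can then compare against [Last Data Year] instead of TODAY(), so the report keeps working when the data stops in 2019.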

The first thing to do is to test that the flags are working:

  • Year from the date table
  • Flags from the flag table

And this works fine. There are only flags for 2003 to 2019, so all the years not attached to these are omitted.

Then the Activities measure from the fact table is added, and it all seems to go wrong (keys have been added to show the issue).

Note that Activities is 0.

The keys (DateKey from dimension to fact, FlagKey from dimension to fact) are essentially cross joining. There should only be one year per flag.

If you filter the table so Activities is greater than 0, everything is resolved: the fact table is being filtered.

What is happening?

When the measure comes back as 0 or null, we are not filtering the fact table.

So we want to filter Year by the flag, but there is only a single-direction filter to the fact table. Without the filter on the fact table, it's almost as if there is a block in the middle, not allowing other filters to work.

You only see this because:

a. You want to see metrics that are null or 0.

b. If the flag was in the date table rather than a separate dimension there wouldn't be an issue. Because it's in another dimension, it can't filter the year.

For this specific issue, the relationships between the date dimension and the fact, and between the flag dimension and the fact, have been changed from single direction to bi-directional (cross filter direction: Both).

Now the flag table can filter the year even when we aren't filtering on the measure.
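If you would rather not change the model relationships, an alternative is to turn the filter direction on only inside the calculation that needs it, using CROSSFILTER. This is a sketch, and the table and column names are assumptions about this model:

```dax
-- Sketch: keep the relationships single direction in the model and enable
-- bi-directional filtering only for this calculation.
-- 'Activities', 'Date' and 'Flag' table/column names are assumptions.
Activities (flag aware) =
CALCULATE (
    [Activities],
    CROSSFILTER ( 'Activities'[DateKey], 'Date'[DateKey], BOTH ),
    CROSSFILTER ( 'Activities'[FlagKey], 'Flag'[FlagKey], BOTH )
)
```

This limits the performance cost to the visuals that use this measure, rather than applying it across the whole model.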

Why are bi-directional filters not recommended in Power BI?

There is always a performance cost when you use a bi-directional filter, so you should only do this if you have to.

Bi-directional filters are one of the top Power BI performance killers, because Power BI has to work much harder with that relationship.

Other Ways to Resolve the issue

Resolve in SQL

The flags are specific to the report date, NOT the current date. But if the date dimension was created in SQL before Power BI, these flags could have been added to the Date table and then simply imported into Power BI within the date dimension.
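As a sketch of what that could look like in SQL, the flag can be driven off the last year in the fact table rather than GETDATE(). The schema, table and column names (dim.DimDate, fact.Activities, Last10YearsFlag) are all assumptions:

```sql
-- Sketch: flag the last 10 years of data in the date dimension,
-- relative to the newest year in the fact table, not the current date.
-- dim.DimDate, fact.Activities and the column names are assumed.
DECLARE @LastYear int =
    (SELECT MAX(d.[Year])
     FROM fact.Activities AS f
     JOIN dim.DimDate     AS d ON d.DateKey = f.DateKey);

UPDATE dim.DimDate
SET    Last10YearsFlag = CASE WHEN [Year] BETWEEN @LastYear - 9 AND @LastYear
                              THEN 1 ELSE 0 END;
```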

Resolve in Power Query Editor in Power BI

  • Keep the date dimension and the flag dimension intact in Power BI.
  • In Power Query Editor, merge the flags into the date table.
  • Add the flag for the last 10 years into the date table.
  • Then filter the date table so there are only 10 years' worth of dates.
  • This would ensure you didn't need the bi-directional filter, so long as the requirement is that you want to see 10 years of data.
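The steps above could be sketched in Power Query (M) roughly as follows. The query, table and column names (DimDate, DimFlag, Last10YearsFlag) are assumptions, and the merge assumes the flag dimension carries a Year column to join on:

```powerquery
// Sketch: merge the flag dimension into the date table, then keep
// only the flagged (last 10) years.
let
    Source   = DimDate,
    Merged   = Table.NestedJoin(Source, {"Year"}, DimFlag, {"Year"}, "Flags", JoinKind.LeftOuter),
    Expanded = Table.ExpandTableColumn(Merged, "Flags", {"Last10YearsFlag"}),
    Filtered = Table.SelectRows(Expanded, each [Last10YearsFlag] = 1)
in
    Filtered
```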

Azure Data Factory Moving from Development to Production – Part 2. Using Key Vault for Linked Services

In Azure Data Factory Moving from Development to Production we looked at how we can use Azure DevOps to move the JSON code for Data Factory from development to production.

It's going well, but I have been left with an issue: every time I move into production, the details for the linked services have to be re-added. Let's have a look at the SQL Server and the Data Lake Gen2 account.

Development

Notice that the information has been entered manually, including the storage account key.

Again, in this instance the information has been entered manually. SQL Server authentication is being used because we have a user in the SQL DB with all the privileges that Data Factory needs.

DevOps Data Factory release Pipeline

Go into Edit on the release pipeline.

Within the Prod stage we have an agent process.

We are looking for the Override Template Parameters section.

Note that currently Account Key and SQL Database Connection String are null.

Provisioning Azure Key vault to hold the Secrets

Managed Identity for Data Factory

Copy your Azure Data Factory name from Data Factory in Azure.

You need to have a Key Vault set up in development.

Get and List permissions allow Data Factory to read secret information from the Key Vault.

Paste the Data Factory name into Select Principal.

In Key Vault, create a secret for the Azure Data Lake storage.

For the Key Vault secret, I gave it the secret value by copying across the access key from the storage account's Access Keys section.

The content type was simply set as the name of the storage account for this exercise.
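For reference, the same access policy and secret could be set up with the Azure CLI. This is a sketch to run against your own subscription, and every name below (resource group, factory, vault, storage account, secret name) is an assumption:

```shell
# Sketch: grant the Data Factory managed identity Get/List on secrets,
# then store the storage account key as a Key Vault secret.
# All resource names here are assumptions.
# (az datafactory requires the datafactory CLI extension.)
ADF_PRINCIPAL_ID=$(az datafactory show \
    --name dev-uks-project-adf \
    --resource-group dev-uks-project-rg \
    --query identity.principalId -o tsv)

az keyvault set-policy \
    --name dev-uks-project-kv \
    --object-id "$ADF_PRINCIPAL_ID" \
    --secret-permissions get list

STORAGE_KEY=$(az storage account keys list \
    --account-name devuksprojectlake \
    --resource-group dev-uks-project-rg \
    --query "[0].value" -o tsv)

az keyvault secret set \
    --vault-name dev-uks-project-kv \
    --name AzureDataLakeStorageGen2-LS-accountKey \
    --value "$STORAGE_KEY"
```

Note that Key Vault secret names only allow letters, digits and dashes, hence the dashes in the secret name here.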

In Data Factory Create a Linked Service to the Key Vault

Test and ensure it successfully connects

Use the new Key Vault linked service to reset the Data Lake linked service.

How does this Data Lake Linked Service change the DevOps Release Pipeline?

It's time to release our new Data Factory settings into production. Make sure you have published Data Factory into the DevOps Git repo.

Production Key vault Updates

We need to update production in the same way as development:

  1. In the production Key Vault, add the production Data Factory name to Access Policies (as an application) with Get and List on secrets.
  2. Ensure that there is a secret for the production Data Lake key, AzureDataLakeStorageGen2_LS_accountKey.
  3. Check that your Key Vault connection works in production before the next step.

Azure DevOps Repos

In Azure DevOps, go to your Data Factory repos.

Notice that your linked service information for the Data Lake now references the Key Vault secret. It's not hardcoded anymore, which is exactly what we want.
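The linked service JSON ends up shaped roughly like this sketch. The linked service, storage account, vault and secret names are assumptions; the key piece is the AzureKeyVaultSecret reference in place of a hardcoded key:

```json
{
  "name": "AzureDataLakeStorageGen2_LS",
  "properties": {
    "type": "AzureBlobFS",
    "typeProperties": {
      "url": "https://devuksprojectlake.dfs.core.windows.net",
      "accountKey": {
        "type": "AzureKeyVaultSecret",
        "store": {
          "referenceName": "AzureKeyVault_LS",
          "type": "LinkedServiceReference"
        },
        "secretName": "AzureDataLakeStorageGen2-LS-accountKey"
      }
    }
  }
}
```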

Azure DevOps Release Pipeline

Go to Edit on the Data Factory release pipeline.

When the job in Prod is clicked on, you can go to the Override Template Parameters section, and notice there is now an error.

AzureKeyVault1_properties_typeProperties_baseUrl is the missing parameter. Basically, at this point you need to delete the code in the Override Template Parameters box and then click the button to regenerate the new parameters.

Override with production information (I saved the code so I could re-copy the bits I need).

Once done, notice that the -AzureDataLakeStorageGen2_LS_accountKey "" parameter is now gone, because it's being handled by the Key Vault.

Let's save and create a release.

New failures in the Release

2021-02-08T13:45:13.7321486Z ##[error]ResourceNotFound: The Resource 'Microsoft.DataFactory/factories/prod-uks-Project-adf' under resource group 'prd-uks-Project-rg' was not found. For more details please go to https://aka.ms/ARMResourceNotFoundFix

Make sure that your override parameters are OK. I updated:

  • Data Factory name, from Data Factory
  • Primary endpoint for Data Lake Storage, from the storage account's properties
  • Vault URI, from the Key Vault's properties
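Put together, the override template parameters end up looking something like this sketch. Every value, and the data lake parameter name, is an assumption; use your own production names:

```
-factoryName "prod-uks-project-adf"
-AzureDataLakeStorageGen2_LS_properties_typeProperties_url "https://produksprojectlake.dfs.core.windows.net"
-AzureKeyVault1_properties_typeProperties_baseUrl "https://prod-uks-project-kv.vault.azure.net/"
```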

Repeat the Process for SQL Database

With everything in place, we need to introduce a connection string into Key Vault.

I have a user set up in my SQL database; the user has GRANT SELECT, INSERT, UPDATE, DELETE, EXEC, ALTER on all schemas.

I want to include the user name and password in the connection string and use SQL authentication.
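As a sketch of that database user in T-SQL (the user name, password and schema here are illustrative only):

```sql
-- Sketch: a contained SQL authentication user for Data Factory in Azure SQL DB.
-- User name, password and schema are illustrative only.
CREATE USER projectDBowner WITH PASSWORD = 'Password1';
GRANT SELECT, INSERT, UPDATE, DELETE, EXECUTE, ALTER
    ON SCHEMA::dbo TO projectDBowner;
```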

Secret Name

AzureSQLDatabase-project-ConnectionString

Secret

Server=tcp:dev-uks-project-sql.database.windows.net,1433;Database=dev-uks-project-sqldb;User Id=projectDBowner;Password=Password1;

The connection string has been set as above. For more information on connection strings, see SQL Server Connection Strings.

Go back to Data Factory and set up the new secret for SQL Server.

The connection test is successful.

Data Factory and DevOps

  1. Back in Data Factory, publish the new linked service code.
  2. Go into the Dev repos and check that you are happy with the new Key Vault information in the linked service code.
  3. Go to the Prod Key Vault and make sure the secret is set with the connection string for the SQL DB.
  4. Test that the Key Vault secret works in Prod.
  5. Back in DevOps, go to Release Pipelines and Edit for the ADF release CD pipeline (release pipelines are CD, for Continuous Delivery; build pipelines are CI, for Continuous Integration).
  6. Edit the Prod stage (I only have Prod) ARM Template Deployment task, and copy the Override Template Parameters code into a file for later.
  7. Delete the code and click the … to get the latest parameter information.
  8. Re-add your production parameters; most can be taken from the code you just copied.
  9. Create a new release.
  10. Go to the linked services in Data Factory and check they are still production, they still use Key Vault, and they still work.
Now this is all in place, the development Data Factory can be published up to production. There is no need to reset the linked services, and all your information about keys and passwords is hidden in the Key Vault.
