After 14 years using Microsoft on-premises BI tools (SQL Server, Reporting Services, Integration Services and Analysis Services), it's time to embrace Business Intelligence in the cloud.
Thank goodness for the amazing people on the Power BI Forums for helping with this. I have the solution, but I wanted to write a post to explain the why.
The Problem
I have accounts and events occurring on each account. What I want to know is the time period for which they are open.
Just for this example I have taken one account with 11 activities.
If you run the visual against the start date, all you get is a measure against the start time period.
We are more interested in creating a measure to show how long the activity went on for.
For this example I am going to work with the activity starting on Jan 18 2017, because its end date is March 21st 2017, so it should span 3 months.
The Date Dimension
To do this you must have a date dimension. For the purposes of this example I'm going to create one in DAX as a calculated table, using the following:
Notice that I have used the Start and End Dates to set the min and max years. In practice you should already have a date dimension to work with.
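As a rough sketch (the fact table and column names here, Accounts[Start Date] and Accounts[End Date], are placeholders for your own), the calculated table could look something like this:

Dates =
CALENDAR (
    DATE ( YEAR ( MIN ( Accounts[Start Date] ) ), 1, 1 ),    // 1st Jan of the earliest start year
    DATE ( YEAR ( MAX ( Accounts[End Date] ) ), 12, 31 )     // 31st Dec of the latest end year
)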
Set up the relationship between the fact table and the date dimension.
Start Date is the active join and End Date is the inactive join.
You could create a role-playing dimension for the End Date and have that as an active join, but in this case I'm going to go with the above design.
Account Start flag (New Measure)
CALCULATE – you can modify your expression using a filter.
ALLSELECTED – if you have a slicer set to Year (Date), ALLSELECTED takes the context from the slicer, e.g. 2018.
The date is less than or equal to the max date in the filter context (see the sketch below).
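The exact measure isn't shown here, but a minimal sketch of the described logic, assuming the date table is called Dates and the fact table Accounts, might look like this:

Account Start Flag =
CALCULATE (
    COUNTROWS ( Accounts ),
    FILTER (
        ALLSELECTED ( Dates[Date] ),        // keep the dates selected by the slicer
        Dates[Date] <= MAX ( Dates[Date] )  // every date up to the end of the period on the visual
    )
)

Because the Start Date relationship is the active one, this counts the account rows whose start date falls on or before the last date of the current period.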
Account End Flag (New Measure)
ENDOFMONTH – Returns the last date of the month in the current context for the specified column of dates.
The date at the end of the month is less than the max date in the filter context.
USERELATIONSHIP – because End Date is the inactive relationship, we need to specify that this relationship is used rather than Start Date. A sketch follows.
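Again, a hedged sketch with the same assumed names (using the scalar EOMONTH to express the "reset to the end of the month" comparison described above, where the post refers to ENDOFMONTH):

Account End Flag =
CALCULATE (
    COUNTROWS ( Accounts ),
    FILTER (
        ALLSELECTED ( Dates[Date] ),
        EOMONTH ( Dates[Date], 0 ) < MAX ( Dates[Date] )   // only dates in months that end before the current period
    ),
    USERELATIONSHIP ( Accounts[End Date], Dates[Date] )    // switch to the inactive End Date relationship
)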
Add an Open Measure
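Putting the two together, a sketch of the open measure (assuming the two flag measures above) is simply one flag minus the other:

Open = [Account Start Flag] - [Account End Flag]    // 1 - blank = 1 while open, 1 - 1 = 0 once closed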
To understand the results we need to try and understand how the new measures work
Jan
Start Flag = 1 because 18th Jan is less than Jan 31st
End Flag = NULL because March 21st (reset to March 31st) is NOT less than Jan 31st
Open = 1 - null = 1
Feb
Start Flag = 1 because 18th Jan is less than Feb 28th
End Flag = NULL because March 21st (reset to March 31st) is NOT less than Feb 28th
Open = 1 - null = 1
Mar
Start Flag = 1 because 18th Jan is less than Mar 31st
End Flag = NULL because March 21st (reset to March 31st) is NOT less than March 31st
Open = 1 - null = 1
Apr
Start Flag = 1 because 18th Jan is less than April 30th
End Flag = 1 because March 21st (reset to March 31st) is less than April 30th
Open = 1 - 1 = 0
There are usually lots of ways to answer one question, so this is just one of many, but the Open flag now allows me to pull up all my activities and show in a matrix where they are open.
Dataflows are essentially an online collection and storage tool. Power Query connects to the data at source, then collects and transforms it. The dataflow then stores the data in a table in the cloud. Dataflows are stored in a data lake, which is all automated for you.
Dataflows unify data from all your different sources. It should be noted that a data warehouse is still the recommended architecture, with dataflows sitting over the top.
Dataflows also introduce the concepts of the Common Data Service (CDS) and the Common Data Model (CDM). CDM allows organisations to use standard data formats to provide consistency across deployments. Azure Data Lake Storage Gen2 can now be combined with dataflows to provide a structured, centralised data source.
When you integrate CDM with Data Lake Gen2 you get structural consistency, and you can use CDM folders in the lake that contain your schemas in standard CDM format.
Dataflows store their definitions and data in CDM folders, in Model.json format. If you have a Model.json file, it shows that you are adhering to CDM.
Of course, this can be quite difficult when you are working with data that does not adhere to the CDM format. I can see it being really useful when you are starting from scratch, but having looked at my own data sources, they are always quite far from CDM format.
Advantages to setting up your own Data Lake Gen2 Store
Essentially, you can get by using the internal data lake store, but there are lots of reasons (and some of these advantages will only arrive in future releases) why setting up your own store is a good thing.
Great for re-use: if you are given access to the workspace you can use the dataflows already created for you.
Centralise your prepped data
Remember this doesn’t replace your data warehouse. It just adds a preparation and transformation layer above the data warehouse. Instead of having to wait to get your small change added to the warehouse you can add it to this layer.
Data within your data lake can be integrated into other solutions by developers
Data Lake Gen2 is enormously scalable for data
Dataflow data and definition files can be leveraged by developers for AI services
Steps to Set Azure Data Lake Gen2 as Dataflow Storage
The storage account must be created in the same Azure Active Directory tenant as the Power BI tenant.
The storage account must be created in the same region as your Power BI tenant. To determine where your Power BI tenant is located, see below.
The storage account must have the Hierarchical Name Space feature enabled.
Power BI service must be granted a Reader role on the storage account.
A File system named powerbi must be created.
Power BI services must be authorized to the powerbi filesystem you create.
Once configured, it can't be changed. You can't go back to the default storage for your dataflows.
Set Storage Account in Same Region as your Power BI tenant
Log into the Power BI service, then go to Help (?) and About Power BI.
The data is stored in North Europe (Ireland).
When you set up your data lake, ensure the North Europe region is selected.
Set up Azure Data Lake V2 in Azure Portal
When you currently go into Azure and look for Data Lake Gen2, you can only find Gen1.
So the question is, how do you set up Gen2 in the Azure Portal? (As of 25th November 2019; hopefully this will get easier in the future.)
First of all I go to the subscription where I want to add the new Data Lake Gen2.
Open up the Portal menu (Now hidden to the left of the screen)
Choose Create a resource
Next choose Storage and then Storage Account
Note that the Account kind is Storage V2 (general purpose)
Make sure the Location is the same as your Power BI service. I'm not using this functionality at the moment, but there is no harm in applying this logic.
It is recommended to set replication setting to Read-access geo-redundant storage (RA-GRS)
For the time being, I am leaving everything else as standard
Next go to Advanced
The most important setting here is Data Lake Storage Gen2: enable the Hierarchical namespace and your storage account will be created as Data Lake Storage Gen2.
Click Review and Create
Grant Reader Role to Power BI Service
This is all built in so it should be fairly straightforward.
In Azure, go to your new storage account (if you aren't already in it) and go to Add role assignment.
Once there, choose the Reader role and select Power BI Service, which as you can see is already in the list.
It takes about 30 minutes for this to take effect.
Create a powerbi file System
Power BI needs a filesystem specifically named powerbi so that dataflows can be stored in it.
We now have a few options available to us. I have some files to add so I am going to add them to a container
Click on Containers and then + File System
Note that to store dataflows the file system must be called powerbi. Click OK.
Clicking on your new storage account(s), you are told to download Azure Storage Explorer.
I already have Azure Storage Explorer downloaded. If you don't have it, it's something you will absolutely need to work with Azure Storage accounts.
Once downloaded, open Azure Storage Explorer.
You will need to add your Azure Storage accounts by clicking the little connector icon.
You will be asked to sign into your account with your Office 365 credentials and 2FA authentication.
This will log you into all your subscriptions and services.
You are good to go
Here you find your subscription, then go to the Data Lake Storage Gen2 and find the new file system powerbi.
Grant Power BI permissions to the file system
Before we connect, we need to grant permission for Power BI to use the file system (again this is specific to using Data Lake Gen2 as a dataflow store, but at this point we may as well set up the permissions).
Go to the Azure Portal and Azure Active Directory
Then select Enterprise Applications
Change the Application type drop down to All Applications
Power Query Online and Power BI Premium and Power BI Service are in the list.
You will need the Object IDs of these applications.
Back to Azure Storage Explorer (Manage Access)
Navigate back to the powerbi file system, right click and choose Manage Access.
Click Add and grab the object ID of Power BI Service to manage its access.
Set Read, Write and Execute access for Power BI Service, and repeat the process for Power BI Premium.
Repeat for Power Query Online, but set Write and Execute access.
The Other entry also needs setting up as follows.
Connect the Data Lake Gen2 Storage Account to Power BI Dataflows
To do this you need to be a Power BI admin. Go to the Power BI service and navigate to the Admin Portal.
From here Connect your Azure Data Lake Storage Gen2.
Add your Subscription ID, Resource group Name and Storage Account name of your Data Lake
It is now connected to Power BI
Allow Admins to Assign Workspaces
Finally, still in the Admin Portal, go to Dataflow settings
Switch Allow workspace admins to assign workspaces to this storage account to On
Workspace admins can now assign workspaces to the filesystem created
Things to Consider
Here is where it starts to get a little bit hazy
This is all very much still in preview and there will be lots of updates coming
Once your dataflow storage location is configured it can't be changed, so don't do this on a whim.
You have to be an owner of the dataflow, or be authorised to the CDM folder in the data lake, to use the dataflow.
Once you have created a dataflow you can't change its storage location.
It is your organisation's dataflow storage account, so there can only be one.
Because of this, and the fact that it's still in development, I am going to wait to set up a central storage account for our dataflows.
I'm still unsure what you would do with dataflows that are already set up. Do they stay in the default area, or can you reprocess them into the central data lake?
What happens if you want to move the data lake to a new subscription? Is that not possible?
I will be going back to this when I have a few more answers to these questions
Prior to the brand new Azure Data Lake, I was adding all my files into Blob Storage. However, Azure Data Lake Gen2 is built on Blob Storage and Data Lake Gen1.
It's built for big data, and a fundamental change is that we now have a hierarchical namespace. This organises your files into directories.
So now we can do things like use all files from a specific directory, or delete all files from a specific directory. We can categorise our files within the data lake.
Set up Azure Data Lake V2 in Azure Portal
When you currently go into Azure and look for Data Lake Gen2, you can only find Gen1.
So the question is, how do you set up Gen2 in the Azure Portal? (As of 25th November 2019; hopefully this will get easier in the future.)
First of all I go to the subscription where I want to add the new Data Lake Gen2.
Open up the Portal menu (Now hidden to the left of the screen)
Choose Create a resource
Next choose Storage and then Storage Account
Note that the Account kind is Storage V2 (general purpose)
I've set the Location to North Europe, simply because I know that's where our Power BI data in the service is stored and I may as well stick with this.
For the time being, I am leaving everything else as standard
Next go to Advanced
The most important setting here is Data Lake Storage Gen2: enable the Hierarchical namespace and your storage account will be created as Data Lake Storage Gen2.
Click Review and Create
Create a file System within a Container
We now have a few options available to us. I have some files to add so I am going to add them to a container
Click on Containers and then + File System
Click OK
Clicking on your new storage account(s), you are told to download Azure Storage Explorer.
I already have Azure Storage Explorer downloaded. If you don't have it, it's something you will absolutely need to work with Azure Storage accounts.
Once downloaded, open Azure Storage Explorer.
You will need to add your Azure Storage accounts by clicking the little connector icon.
You will be asked to sign into your account with your Office 365 credentials and 2FA authentication.
This will log you into all your subscriptions and services.
You are good to go
Here you find your subscription, then go to the Data Lake Storage Gen2 and find the new file system.
I have added a folder called Workshop1Files to my file system.
Obviously Data Lake Storage gives you many ways of working with files and automating files into the storage area. In this case I am simply going to move a file into my new folder to work with.
Double click on the folder, then click Upload and Upload Files.
And now your file is in the cloud, in an Azure Data Lake, ready to use.
Connect to your Azure File with Power BI Desktop
The first test is: can we access this data within Power BI Desktop?
Open Power BI Desktop and Get Data
Choose Azure Data Lake Storage Gen2 (Currently in Beta)
Add the URL
Data Lake Storage Gen2 URLs have the following pattern: https://<accountname>.dfs.core.windows.net/<filesystemname>/<subfolder>
If you right click on the file in Storage Explorer and go to Properties, you can see there is a difference in structure.
Connect to your Azure File with Power BI Data Flow
I am creating dataflows in the Power BI service to ensure they can be reused across the company. The question is, can I connect to the above file in the service via a dataflow?
In the Power BI service, add a dataflow, which takes you into the Power Query editor in the service. I already had some dataflows connected to an Azure database.
The data is in Azure Data Lake Storage, so the first thing I do is try the Azure route.
However, there is no Azure Data Lake Storage Gen2 connector. This must be something coming in the future. So then I go to File and click on Get Data, Text/CSV.
You will need to add the file path and your credentials (as per previous advice, use dfs not blob in the URL). This seems a little flaky at the moment; I chose Organisational Account first before adding the URL and then it seemed to work.
Remember, go back to Azure Storage Explorer: if you click on Properties you can grab the URL from here.
We don’t need a Gateway Setting up because everything is now in the cloud.
Clicking Next, nothing happens; it just keeps bouncing back to the same window.
Attempting to use the Blob Storage connector also doesn’t work (Using the Azure Account Key as authentication).
(Screenshots: with blob in the URL, and with dfs in the URL.)
It would appear that currently I have hit a brick wall, and there is no Data Lake Gen2 connector for dataflows yet.
I will be keeping an eye on this because obviously, when you are pushing the new generation of data lakes and dataflows, there needs to be a Gen2 connector for dataflows.
Update
Had a reply back on the Power BI Forum (Not a good one)
To look at the new Power BI updates I went for my go-to dataset, the AdventureWorks data warehouse.
I have added the following tables
DimEmployee
Dimdate
DimSalesTerritory
DimProduct
DimProductSubcategory
DimProductcategory
Dimcustomer
FactInternetSales
Use the Updated Ribbon
Once I have added the data, I want to start using the new Power BI ribbon, which is in preview. Go into Options, Preview features, and select the Updated ribbon feature.
You have to restart Power BI Desktop to get the new ribbon.
It looks quite different. Let's see how it goes…
Decomposition Tree
A brand new AI visual available in November is the Decomposition Tree. I have used the Key Influencers visual with social media data, which is great, so I am very excited about this new visual.
With the new visual we can do root cause analysis, look at how individual categories contribute to a whole, and rank categories against the selected measure.
Again, this is a preview feature so it needs enabling in options
Just like the Key influencers visual, we need something to Analyse and data to explain the analysis by.
Unlike Key Influencers, where we analyse by category (e.g. positive, negative and neutral), here we are analysing a metric.
Let's pick something to look at.
Class breaks down the Sales Amount.
Next, I clicked on the + next to H and chose Product Category. We can now see that Bikes make up the majority of the H category.
Next, try using the AI feature. The options are High Value and Low Value. I'm going to choose High Value: it will look through all the categories that are left and pick the one that is the highest contributor to that metric.
The lightbulb against the title lets you know it was picked by AI.
Let's do the same but look at Low Value.
It picks out the Central region as the very lowest of the Explain By values.
This is fantastic. You can also lock a level so a user can't remove it, change the colours, etc. I can't wait to use this on my actual working data sets.
Advanced Gauge
Straight away, the new ribbon changes how you go to the marketplace
Now you go to More Visuals and From AppSource to get to the marketplace.
Go for the Advanced Gauge from xViz.
I have Year to date Internet Sales and Quantity of products bought
I also have measures for last year's year-to-date Sales and Quantity, so I can add these into the gauge.
There are other options like setting minimum and maximum values and tooltips
Hierarchical Filter
I have used the Hierarchical Slicer before; let's see how the new filter looks (get it from AppSource – Hierarchical Filter by xViz).
You need a hierarchy in your data to use this visual. I have the Group hierarchy (Group, Country and Region). One of the best things about this visual is that not only can you use it to slice your data, but you can also add a measure, which gives you even more information right on the slicer itself.
However, I don't think it comes across as a slicer, something the user can interact with.
Financial reporting Matrix by Profitbase
A Matrix with financial formatting options
The video talks about sticky rows and sticky columns, in that you can set them to always be on screen even while scrolling, which is good.
However the Information on the Overview video uses
There is a note on this one: it's really hard to delete. It took me a fair while the other day to click it in the right place to delete what is possibly a measure called DateCalculations in the columns section, so this may be one to look at in more detail later.
Distribution Visual
More Visuals – From App Source
This is supposed to be great for setting up a global legend; however, no matter what I tried, I couldn't get the visual to display any information, so this one is a fail for me.
Conclusion
Some pretty good updates this month; I'm particularly excited about the Decomposition Tree.
Another major one for me is the LinkedIn Sales Navigator connector. However I’m not sure I have the right set up to use it at the moment.
The video explains that you need an Enterprise Plan with either admin or reporting user roles to use the connector, but what is not mentioned is what this refers to. Is this LinkedIn?
I will be looking at this in more detail hopefully at a later stage
I have recently been setting up training sessions for Analysis Services, and Azure security has been one of the biggest learning curves.
Let's have a look at what we needed to do with our AdventureWorks demo and all the areas of possibility.
Azure Analysis Services Access Control (IAM)
I have created an Analysis Services instance in Azure, which makes me the administrator. I can do everything, including adding models that I create in Visual Studio 2019.
However, I will be working with lots of other people. In the first instance they will be using the models I have set up to create reports in Power BI.
This is only a training demo and is not how I would set up a fully functioning live Analysis Services model.
In Azure we can add role assignments (click on Access Control (IAM)), but what does this mean in terms of actually using the service?
I am an owner. Azure has a number of built-in role-based access control (RBAC) roles to use.
Owner – Has full access to all resources including the right to delegate access to others.
Contributor – Can create and manage all types of Azure resources but can’t grant access to others.
The model above shows how you can use RBAC to segregate duties.
Using the following information, I set my users up as Contributors in the hope they would be able to access the models.
With Contributor in place, the users could see the Analysis Services instance in Power BI but didn't have access to the model.
As a test the users were bumped up to Owner and still couldn't see the models in Power BI.
Azure IAM relates to the Analysis Services resource, not the models created on it.
Visual Studio 2019 (Roles)
The next step is to see what happens when we create a role in the Analysis Services model in Visual Studio.
In Tabular Explorer (make sure you have downloaded the Analysis Services extension for Visual Studio), click on Roles. I have created a Training role for this demo.
I gave all the users the permission Read and Process.
Permission – Description
None – Can't modify the model in SSMS. Can't query data.
Read – Can't modify the model in SSMS. Can query data.
Read and Process – Can query data. Can run process operations by running a script or package. Can't make any changes to the database. Can't view the model in SSMS.
Process – Can't query data. Can run process operations by running a script or package. Can't make any changes to the database.
Administrator – Can modify the model schema and query data.
Once the Training role was set, the model was reprocessed.
Users now have access to the Analysis Services model and can use it in Power BI.
The question is, can the users access the model if they are in this role but not in Analysis Services IAM?
Both users were taken out of IAM in Azure and still had access to use the models.
New requirement for users to add new models to the Analysis Services database
The next stage was to allow the users to create their own AdventureWorks models on the Analysis Services instance.
With all the above security in place, when a user processed their tabular model, the following error occurred.
This user has Owner access in Azure IAM and Read and Process access on the Analysis Services model.
How do we allow users to actually create and modify models?
SSMS – Roles
SQL Server Management Studio is the place to grant model permissions
Open SSMS as administrator by right clicking and choosing Run as Administrator.
Note that you can see the Training role set up in Visual Studio. Originally, Full Control was unticked.
I ticked Full Control, which is set on the model immediately. No need to process anything.
The user tried again and still couldn't add the new model to the Analysis Services instance.
Only server administrators, or database administrators having Full Control permissions, can deploy a cube from source files to a server, or create roles and assign members.
It would appear that Full Control (Administrator) is not the same as server administrator, because you can't deploy a tabular model.
Question: if I reprocess the Visual Studio model with the permission set as Read and Process, will this override the new selection in SSMS?
Yes. If you reprocess your tabular model (because you have updated the model, added new measures, etc.), role permissions from Visual Studio will override the role permissions set in SSMS. Ensure that any roles changed in SSMS are reflected in Visual Studio.
SSMS – Analysis Services Properties
Right click on the Analysis Services server (under Connect) and choose Properties.
Then select Security.
Note that this is where server administrators are assigned, and only these people can create and deploy their own models to the Analysis Services instance.
Going back to Azure, you can also set up admins here.
Note that everything we just did in SSMS is also available in Azure. Also, if we redeploy the Analysis Services tabular model via Visual Studio, it doesn't do anything to the Analysis Services admins.
Our roles aren't available to view in Azure, just the Analysis Services admins.
Everything is now working
There are still questions to be answered. Feel free to let me know your thoughts.
When would you need to add users to Azure IAM, given that you can let them use the tabular models, and deploy models, by adding them to roles and the server administrator role in Visual Studio and SSMS?
We have an Analysis Services instance on our training subscription and resource group, and obviously it is only used within office hours as it is only for training.
The last thing we want is to rely on someone remembering to pause the service every night. As soon as you forget to do this you end up paying for the service when you don't need it.
Clearly, this needs automating. There are lots of how-tos online for this, so let's see how easy it is to set up.
What you need
Azure account
Azure Analysis Services Model
Create an Automation Account
You need an Azure Automation account to run an Azure runbook with PowerShell code.
Search for Automation Accounts.
Make sure Create Azure Run As Account is set to Yes.
Set up Credentials
Runbook credentials are very similar to Azure SQL Server credentials. In your new Automation account, click on Credentials.
I've set up a new account for azure_trainingmantenance with the same user name and a password that I have saved for later.
Connections
To see information like your Tenant ID and Subscription ID, simply click on Connections.
These fields can be added to the PowerShell code later.
Add Modules
All the Analysis Services methods (cmdlets) are in PowerShell modules and will need to be installed if they are not already.
Go to Modules and check that AzureRM.Profile is available.
If not, we need to add it: go to Modules and Browse Gallery.
It takes a while for this to import. You need AzureRM.Profile before the next module, AzureRM.AnalysisServices, so repeat the process: in Modules, go to the gallery, search and import.
You now have the cmdlets for Analysis Services within the new Automation account.
Create a RunBook
Click on Runbooks and add a new runbook.
Now it's time to add the script to set up the automation.
You can get your server name by going to Analysis Services in Azure and copying the server name (this runbook will pause and unpause this specific Analysis Services instance).
Another important part of the code is sorting out weekdays and weekends according to the SKU (stock keeping unit).
Our training Analysis Services instance is on SKU S0 (Standard, 40 query processing units) and we only need it running during the week.
{
    "WeekDays": [1,2,3,4,5], "StartTime": "08:30:00",
    "StopTime": "17:59:59", "Sku": "S0"
}
Note, you could also use the code to scale your Analysis Services instance up and down to keep costs down. S1 has more processing units for peak hours and is more expensive; S0, with fewer processing units, could be used for off-peak hours. In this case you could set the code not only to pause and unpause at set times, but also to add and remove processing units.
{
    "WeekDays": [1,2,3,4,5], "StartTime": "08:00:00",
    "StopTime": "17:59:59", "Sku": "S1"
},
{
    "WeekDays": [1,2,3,4,5], "StartTime": "18:00:00",
    "StopTime": "23:59:59", "Sku": "S0"
},
{
    "WeekDays": [6, 0], "StartTime": "08:00:00",
    "StopTime": "23:59:59", "Sku": "S0"
}
Test Runbook
Whilst in edit mode you can also test your runbook.
If there are any issues, your test will flag them up. My first test flagged the fact that I had entered the wrong resource group name.
Go back to Edit PowerShell Runbook using the breadcrumb trail (you don't need to save the test).
Save and Publish your runbook.
Create a schedule
Azure Automation lets you schedule a runbook to run once or on a recurring schedule.
I believe that in this instance we just need the scheduler to kick off our runbook, which contains all the code required to pause and restart the Analysis Services instance.
Starts today
The schedule is now linked to the runbook.
Lets see it in Action
The runbook is now ready to go. Currently the training Analysis Services instance is off. Once the schedule kicks in, will this turn everything back on again?
The runbook is successful.
After 6pm the Analysis Services instance is paused, and when I get back into the office at 8.30 the server has restarted.
I am pulling tweets into an Azure Table Storage area and then processing them into a Warehouse
The following shows the very basic Data factory set up
Connections
I have created a Linked Service for the Azure Storage Table PowerBIMentions
And another Linked Service for my Azure SQL Server Table PowerBIMentions
Datasets
The Storage Table was set up in Logic Apps. I won't go into detail about the schema here.
I have created a SQL table to mirror the Azure Table Storage.
Watermark Table in SQL
The Watermark Table is as follows
CREATE TABLE [staging].[watermarktable](
    [TableName] varchar(100) NOT NULL,        -- varchar lengths assumed; they were lost in the original listing
    [WaterMarkDate] datetime NULL,
    [WaterMarkValue] varchar(50) NULL,
    [WaterMarkOffSetDate] datetimeoffset NULL
) ON [PRIMARY]
For the time being, the watermark value stores the date in the same format as in the Azure Table Storage.
Ignore the offset date for the time being.
I have then created a stored procedure to add the table name (staging.PowerBIMentions) and the maximum created date from what is in the table, with some extra script to format that date into the format used in Azure Table Storage.
To understand how this should be formatted you can look at the Azure table using Microsoft Azure Storage Explorer.
As you can see, the format is YYYY-MM-DDTHH:MM:SS.000Z.
I haven't included the code for the SP here, but it is fairly straightforward: you take the max date from the table you are interested in and format the value correctly.
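As a rough illustration only (this is not the original SP; the staging table staging.PowerBIMentions and its CreatedAt column are assumed names), it could look something like this:

CREATE PROCEDURE [staging].[usp_UpdateWatermark]
AS
BEGIN
    -- Take the latest created date from the staging table (column name assumed)
    DECLARE @MaxDate datetime = (SELECT MAX(CreatedAt) FROM [staging].[PowerBIMentions]);

    UPDATE [staging].[watermarktable]
    SET WaterMarkDate  = @MaxDate,
        -- Style 126 gives ISO 8601 (YYYY-MM-DDTHH:MM:SS.mmm); append Z to match Azure Table storage
        WaterMarkValue = CONVERT(varchar(30), @MaxDate, 126) + 'Z'
    WHERE TableName = 'staging.PowerBImentions';
END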
Pipeline
Added a Pipeline called CopyPowerBIMentionsIncremental
The Source is the Storage mentions table
The Sink (destination) is the Azure SQL table.
And then we can easily import the schemas and map them.
Then run the very simple pipeline, which just moves the data and then runs the watermark stored procedure to recalculate the max date time.
Change Pipeline to include a lookup activity
The next time we process, we only want to load new records into the table.
The query to select the correct date from the watermark table:
SELECT MAX(WatermarkValue) AS WatermarkValue FROM [staging].[watermarktable] WHERE WatermarkValue IS NOT NULL AND TableName = 'staging.PowerBImentions'
I have been setting up a Logic App to add tweets to Azure table storage. While setting up a loop to add in Keywords into a table (1 tweet may add multiple rows to the Keywords table) I came across this error
The tweet has generated 10 rows of data (because there are 10 keywords).
I have set up an increment counter to add to the ID, so each one will get 1, 2, 3 and so on. If you toggle through the records you notice this:
record Increment
1 1
2 6
3 3
4 7
5 2
6 4
7 10
8 8
9 9
10 5
And the subsequent write to table storage doesn't work because of conflicts. Clearly the increment isn't happening in order. There is a quick solution to this one.
Go into Edit mode in the Logic App.
For every For Each loop with an increment variable, click … to get to Settings.
Make sure Concurrency Control is set to On and the degree of parallelism is set to 1. Setting it to 1 ensures the iterations run sequentially.
You have a shiny new Analysis Services model containing all the data you need for your Power BI reports and Dashboards
It has been added into an Analysis Services instance because (a) there is a lot of data, possibly too much for a Power BI import, (b) you want to do incremental processing without needing Power BI Premium, and (c) you want to refresh more than 8 times a day.
Everything is all set up, but the final part of the puzzle is how to schedule the refresh of the Analysis Services tabular model.
It's easy enough in Power BI: you simply set the schedule in the service. Let's have a look at how to do this using one of the options, Logic Apps.
Create a Service Principal (SPN)
First we need to create a new Azure Active Directory application and service principal to use with role based access control. Whenever you have code that needs to access or modify resources (the Logic App will refresh the Analysis Services data), you need to create an identity for the app.
Sign into Azure
Go to Azure Active Directory
Next go to App Registrations and + New registration
The web URL is simply one set like the example in the documentation
Then click Register
Assign the Application to a role
Before the app can access resources, it must be assigned a role. For this example we are going to assign a role at the subscription scope, for our Proof of Concept subscription.
Go to All Services and then Subscriptions
I'm going to select the Proof of Concept subscription.
Next select Access Control and add Role assignment
Note I have selected the new analysisservicesapp
Click save to add the Role Assignment
The Service Principal is set up. Now we need to get the values so we can sign into the app
Get Sign in Values
Go to Azure Active Directory
Go to App Registrations again and select the new application
Copy the Tenant ID for later use
Copy the Application ID for later use
Set up your Credentials
I was going to create a certificate, but I found it an extremely complicated process and in the end I couldn't export the created certificate in PowerShell.
Because I haven't been able to create and export a certificate, I am going to use a client secret instead.
Click on New Client Secret
Once added, you need to copy the secret value for use later. You won't be able to access this information again, so don't lose it.
Give the Service Principal (SPN) Authority to Administer Analysis Services
Still in your App click on API Permissions
Next View API Permissions
+ Add a permission and find Azure Analysis Services in APIs my organisation uses
Ensure that the Permission allows you to read and write all models (Tick the box)
then Add permissions
Note that although permissions have changed we still need admin consent
Grant Admin Consent
Granting admin consent requires you to sign in as a global administrator, an application administrator, or a cloud application administrator.
From Azure Active Directory go to App registrations and select the app
Within the app, go to API Permissions
Grant admin consent
Configure Permissions in Azure Analysis Services
Next we need to ensure the new service principal has server administrator permissions in Analysis Services.
Open SQL Server Management Studio by right clicking and choosing Run as Administrator, then connect to the Analysis Services instance.
Add in your user name (in my case the email address) but not the password.
You add your Office 365 password when you sign into your account (Next).
Right click on the Analysis Services name and go to Properties, then Security.
Click Add, then search for the app that has been created. This can then be added as a server administrator (allowing it to add models, amend models, refresh data, etc.).
Click OK
Create the Logic App
Finally, we can create the Logic App in Azure
The Logic App will be triggered by an HTTP request, which will in turn be called from Data Factory, the Azure orchestration tool.
In Azure Go to Logic Apps and Add
Then New Step and search for HTTP and then HTTP
Method
POST. POST sends the data inside the body of the HTTP request.
URI (Uniform Resource Identifier)
URI = https://<your server region>/servers/<aas server name>/models/<your database name>/refreshes
The server is the Azure Analysis Services instance; the model (created in Visual Studio) is the one we want to refresh.
There may be multiple models on the server.
Headers
As per the example, I have set the header to Content-Type: application/json.
Queries
Nothing set
Body
This is where you set up all the processing information.
{
    "Type": "Full",
    "CommitMode": "transactional",
    "MaxParallelism": 2,
    "RetryCount": 2,
    "Objects": [
        { "table": "DimCustomer" },
        { "table": "DimDate" },
        { "table": "DimEmployee" },
        { "table": "DimProduct" },
        { "table": "DimProductCategory" },
        { "table": "DimProductSubCategory" },
        { "table": "DimSalesTerritory" },
        { "table": "FactResellerSales" }
    ]
}
Type: Full
The type of processing to perform. Refresh command types can be:
full – process the entire model and recalculate all the dependents (measures, columns, etc.)
clearValues – clear values in the objects and their dependents
calculate – recalculate your formulas
dataOnly – refresh the data in the objects; does not force recalculation
automatic – if the object needs refreshing and recalculating, then do so
add – append data to the partition and recalculate dependents (measures, columns, etc.)
defragment – clean up values in dictionaries that are no longer used
CommitMode: transactional
Determines whether objects are committed in batches or when complete. Modes include:
default
transactional
partialBatch
It's always a good idea to set up partitions within your data, especially on the fact table, to help with processing.
However, the above is an example specifying exactly what to process. You can simplify the code if you wish:
{
    "CommitMode": "transactional",
    "MaxParallelism": 2,
    "RetryCount": 2,
    "Type": "Full"
}
The following is an example of adding partitions into your code. In this example, DimCustomer is one partition:
"table": "DimCustomer",
"partition": "DimCustomer"
The fact table can be made up of many partitions, for example one for each year. You can then decide which partitions to process; for example, you may only want to process the current year's data, as in the sketch below.
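As a sketch only (the partition names below are made up for illustration), the Objects array could then list just the partitions you want to refresh:

"Objects": [
    { "table": "FactInternetSales", "partition": "FactInternetSales2018" },
    { "table": "FactInternetSales", "partition": "FactInternetSales2019" }
]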
Authentication
Active Directory OAuth
Tenant
Use the Tenant ID that we collected previously
Audience
https://*.asazure.windows.net
Client ID
Use the Client ID that we collected Previously
Credential Type
Secret (Remember that we added the Secret to the App)
Secret
The secret value that we saved (remember, for this one you can't go back to Azure and look at the secret; it is shown only once).
Save and test the Logic App by Clicking Run
Consume Logic App with Azure Data Factory
We now have a Logic App that will incrementally process the data. We now need to schedule it by creating a Data Factory to orchestrate the processing.
First we need to copy the HTTP POST URL from the When a HTTP request is received activity in the Logic App.
And then create a new Data Factory in Azure.
For the time being I am not enabling Git.
In Author and Monitor, create a pipeline and drag a Web activity across to the design pane.
And then set up the schedule by adding a trigger
This should now be enough to process all your data into the Analysis Services model, which has a live connection into Power BI.
What we need to do is check that this is doing its job. I will be looking at this in a later blog post.
If you do the usual measure SUM(Total) you would get
ID Total
99 2
101 3
333 4
635 4
But I don't want the visual to show the above. I want the number of occurrences on the axis, and then the number of IDs with that number of occurrences as the value.
You can't use a measure on the axis; measures can only be used as values. How do you solve this problem?
SUMMARIZE returns a summary table for the requested totals over a set of groups. In this case, our group is the IDs.
For the summary we are simply counting our IDs as the frequency (sketched below).
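A minimal sketch of the summary calculated table, assuming the fact table is simply called Fact and the business key column is Fact[ID] (both names are placeholders for your own):

SummaryTable =
SUMMARIZE (
    Fact,
    Fact[ID],                           // one row per ID
    "Frequency", COUNT ( Fact[ID] )     // how many records each ID has
)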
Next, Add a new Table
PotentialFreqs = GENERATESERIES ( MIN ( SummaryTable[Frequency] ), MAX ( SummaryTable[Frequency] ), 1 )
This creates a list of frequencies starting at 1 and ending at our max frequency. There is an issue here: we have a lot of NULL values, which create a large frequency number that we don't want in our series of frequencies. This means we have values from 1 to 98 when there should only be values 1 to 4. How do we remove these values?
Note the addition of the IF statement: if the ID is not zero then count the IDs, else do nothing (see the sketch below).
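With the same assumed names, the revised frequency expression might look like this, so that the zero/null IDs return BLANK and drop out of the series:

SummaryTable =
SUMMARIZE (
    Fact,
    Fact[ID],
    "Frequency", IF ( Fact[ID] <> 0, COUNT ( Fact[ID] ) )   // only count real IDs; zero/blank IDs give BLANK
)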
Create relationships
In order to use the new tables we need to create relationships between them
The SummaryTable is joined to the fact table by the ID (business key) within the fact table.
PotentialFreqs is joined to the SummaryTable via Value to Frequency
The first visual to try this out is a stacked bar chart. Value from the PotentialFreqs table is on the axis and Frequency from the SummaryTable is used as the value.
So we can see that 33 of our IDs have 3 records in the table.
The PotentialFreqs table is really useful if you want to plot your data in a histogram, as it gives you a continuous type to plot against rather than a categorical one.
In conclusion, if you need to use a measure on an axis rather than as a value, create a summary table and then join it to the main tables in your model.
Update
Unfortunately the above solution doesn't quite work, because I want to base my metrics on each year, and sometimes the IDs are split across years. I will be looking at this later.