SUMMARIZE returns a summary table for the requested totals over a set of groups. In this case, our group is the IDs.
For the summary we are simply counting our IDs as the Frequency.
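As a sketch, assuming a source table called Data with an [ID] column (both names are placeholders, not the real model), the summary table might look like this:

```
SummaryTable =
SUMMARIZE (
    Data,
    Data[ID],
    "Frequency", COUNTROWS ( Data )  -- rows per ID becomes the Frequency
)
```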
Next, add a new table:
PotentialFreqs = GENERATESERIES ( MIN ( SummaryTable[Frequency] ), MAX ( SummaryTable[Frequency] ), 1 )
This creates a list of frequencies starting at 1 and ending at our maximum frequency. There is an issue here, though: we have a lot of NULL values, and they create a large frequency number that we don't want in our series of frequencies. This means we have values from 1 to 98 when there should only be values 1 to 4. How do we remove these values?
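One hedged way to handle this, assuming the inflated frequency comes from rows with a blank ID, is to exclude the blanks before summarising, so the series only runs up to the genuine maximum (table and column names are the same placeholder assumptions as before):

```
SummaryTable =
SUMMARIZE (
    FILTER ( Data, NOT ISBLANK ( Data[ID] ) ),  -- drop the NULL IDs first
    Data[ID],
    "Frequency", COUNTROWS ( Data )
)
```

With the blank IDs filtered out, GENERATESERIES over MIN and MAX of [Frequency] no longer stretches out to 98.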
I have a drill through in my crime reports so you can drill through to the data from the top-level report. However, I am aware that users sometimes struggle with these cool Power BI features.
A new feature is that if you have a drill through, you get a "Right-click to drill through" message on the tooltip. Is that enough? For some people it's not. Some people don't understand the concept of drill through and may need more visual help.
It's all about making sure the users are getting the full benefit of Power BI, so it's worth adding a few extras.
How about creating a short video of the drill through in action that you can add to the visual?
To create a video of the drill through I am going to use ezvid for this demo, as it's free. Camtasia is the solution I want to move to, because it can save the file as an animated GIF.
I want to try out some of the new visuals in the Power BI October release, whilst also doing my first public upload of a report. The visuals in the xViz suite can be used for free, or you can purchase them for full functionality. I am using the free versions of these visuals.
To start having a look at these new visuals, I have downloaded some crime stats for my local area.
Hierarchy Variance Table (xViz Visual)
Compare measures within a hierarchical table and see the variance.
The first thing I need is a hierarchy, which I created using Crime and Status.
E.g. Possession of Weapons → Court Result unavailable
I went to the marketplace and selected the new Hierarchy Variance Table.
To create the hierarchy, I simply dragged Outcome Status over Category.
Crime is simply a metric: I added a 1 for each row so it can be summed.
Time Period is our date column.
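The Crime metric can be set up either as a calculated column of 1s or as a row-count measure; a quick sketch of both (the CrimeData table name is an assumption):

```
-- Calculated column: a 1 on every row, which the visual then sums
Crime = 1

-- Or the equivalent as a measure
Crime Count = COUNTROWS ( CrimeData )
```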
Note all the information you get from this new visual: a sparkline and an indicator, along with the metrics the visual calculates for you.
In Formatting you can also go into Number Formatting and switch on Semantic Formatting, which colours the cells based on positive or negative values.
You can also right-click on the items in Category to get access to options such as what to display, sorting, and rearranging the hierarchy.
So what does this tell us? I’m going to look at Robbery.
The first thing to note is that if you have a filter on Year, selecting a year makes the data disappear and you are left with just Baseline. If you have a Year filter, remember to ensure interaction is off for this visual.
We have 9 records in total. It seems to be taking Baseline as the total for 2018.
For Target it seems to be taking the data from the first quarter of 2019. It doesn't make sense to pitch 2018 against the first quarter of 2019.
Let's see if Vehicle Crime makes any more sense.
Again, Baseline appears to be the total for 2018. The Target of 4 could be Quarter 1 and Quarter 2, but that would be missing Quarter 3. So the question is: how does it arrive at the Baseline and Target?
Unfortunately, as yet there is nothing online that explains how these metrics are calculated, which means we have no way of knowing at the moment. Until this information is clearly provided, I would say that this visual is not fit for purpose.
I certainly wouldn't want to demo the visual for a customer who asks what Baseline and Target relate to; at present I wouldn't be able to answer that question. If you have an idea, please let me know.
Linear Gauge Table (xViz Visual)
Compare a value (Resolved) to a target goal (% of the number of crimes). Ideally we want 95% of these crimes in the data set resolved.
This is very straightforward. For the actual value I want resolved crimes, and the maximum value will be the total number of crimes.
I have also set Condition to On in Formatting, where under 60% is red, everything over 90% is blue, and everything in between is amber.
Straight away we are in the red for the number of crimes solved.
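Under the hood the comparison is just resolved crimes against a 95% target; a hedged sketch of the supporting measures (the CrimeData table and its Resolved flag are assumptions, not the real columns):

```
Resolved Crimes =
CALCULATE ( COUNTROWS ( CrimeData ), CrimeData[Resolved] = TRUE () )

Total Crimes = COUNTROWS ( CrimeData )

-- 95% of the total is the goal the gauge compares against
Resolved Target = [Total Crimes] * 0.95
```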
Tag Cloud (xViz Visual)
I use the Word Cloud for Twitter data, where I have already used an Azure text API to create a table of keywords. Tag clouds are very similar to word clouds and allow you to look at tags of words in the data.
The Tag Cloud takes advantage of "extra-miler" tags within the data set (the words that mean more within your data). The Tag Cloud finds these keywords and displays them.
In this case I am using the category of crime:
category = category
Size Value = Crime (Metric)
It's not new, but I thought Key Influencers would be a great visual for this report. I have added a conditional column against Outcome to ensure we only have a few conditions to choose from.
For "No action taken", "No suspect identified" is the top influencer. Further questions could be: why is this the case? Should we not be doing better at finding the person who committed the crime?
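If you prefer DAX to the Power Query conditional-column UI, a sketch of an equivalent calculated column looks like this (the outcome strings here are illustrative examples, not the real list of values):

```
Outcome Group =
SWITCH (
    TRUE (),
    CrimeData[Outcome] = "Investigation complete; no suspect identified", "No action taken",
    CrimeData[Outcome] = "Under investigation", "In progress",
    "Other"  -- everything else falls into a catch-all bucket
)
```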
Publish to Web
I now have all the visuals I need to get the information out there. It's not private data, so I can publish to the web. I want to publish to this WordPress site.
First of all, let's publish to My Workspace in the Power BI Service,
then log in to check that the report is in My Workspace.
Publish to Web
Once there, go to File and Publish to Web
For the time being I have copied the Link and HTML
Install WordPress Plugin
You can install the WordPress plugin for Power BI.
However, you need to be on the Business plan to do this. I'm not on the Business plan; it's £20 per month.
So to end this post, here is the link to the public report
We are going to add some new information into the data set
In Power BI Service, Click Edit … against the #Taskmaster dataset
I’m actually going to add 7 more fields
Description, Sentiment and Name are Text
CreatedAt is a date
Count, Favorited and followersCount are Numbers. Click Done
Return to Microsoft Flow
Edit the flow and go to the "Add rows to a dataset" action.
Count is an Expression
It is simply set as 1
You can Test and Save
Any new data that comes through will contain these data items
Creating sentiment is slightly more involved, but let's go ahead and create it.
After Detect Sentiment, update your flow with the following logic:
Create a Sentiment value that defaults to Neutral. If Score is over 0.7, set it to Positive; if it is under 0.3, set it to Negative.
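In Flow this can be written as a single nested if() expression; a sketch, assuming the sentiment action is named Detect_Sentiment and returns a score between 0 and 1:

```
if(greater(body('Detect_Sentiment')?['score'], 0.7),
   'Positive',
   if(less(body('Detect_Sentiment')?['score'], 0.3),
      'Negative',
      'Neutral'))
```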
We can add this to our data set
We now have lots of new information to use in our streaming reports
Back in Power BI Service
I’ve attempted to add a new Custom Streaming Tile to the dashboard based on a Line chart to look at the count of records
Unfortunately this streaming visual doesn't seem to work, and immediately I can see a fatal flaw with using streaming dataset visuals for this kind of data.
These visuals are for sources that push through an almost constant stream of data. They are not for feeds that at some points don't have much data coming through. You need to have lots of data within the time frame of the streaming dataset (for example, 60 minutes).
I have the #Taskmaster Stream set up for Historical Data Analysis
When historic data analysis is enabled, the dataset created becomes both a streaming dataset and a push dataset.
A streaming dataset has no underlying database.
A push dataset, on the other hand, has a few limitations on how much data can be pushed in:
75 max columns
75 max tables
200,000 max rows stored per table in FIFO dataset
5,000,000 max rows stored per table in ‘none retention policy’ dataset
The data is then stored in Power BI and we can actually access it from Desktop.
This means I can create some really nice reports against my taskmaster data set
You can't create hierarchies or calculated columns over the push dataset, only measures. You are very limited in what you can do. The Key Influencers visual is also unable to work with push data, which is a shame, because it is the perfect visual for analysing positive and negative tweets.
I should have brought the date across as a date in the first instance, because Month is now just a number and I can't change this in Power BI. We have a date, but only against the very latest data.
Time was already in the data set, but this consists of date and time, which you can't reset to create a date hierarchy.
I can't add day names to the day number (Taskmaster airs on Wednesday, so I expect the levels to go up then).
So the push dataset is fairly simple to set up, but it's incredibly limiting in what you can do.
Our initial page is by month. I needed to add Month Name into the Flow.
Next, I drill through on the 9th of October. The name of the day would be great here, because Taskmaster is on a Wednesday.
Finally, we drill through to the hourly tweets.
As you can see, we have a few negatives in the data set which, on reading, aren't actually negative. The Cognitive API doesn't recognise sarcasm, for instance.
There you go: we started out with a streaming dataset and ended up with a push dataset.
The push dataset doesn't need a refresh, which is the main reason to go for it.
One last quick thing about this push dataset: when I attempted to republish the report, I got the following error.
To get past this error I had to delete my report in the Power BI Service before trying again. I have never seen this kind of conflict before, and I'm assuming it's an issue with the push dataset.
I was logging into Power BI this morning when I saw this exciting new feature
We are always looking at new solutions to provide good data lineage, so this is well worth a look.
Data lineage includes the data origin, what happens to it and where it moves over time. Data lineage gives visibility while greatly simplifying the ability to trace errors back to the root cause in a data analytics process.
I have an App workspace set up for Adventureworks so lets have a look at Lineage using this project
Column 1 is my data source. I can see I'm using a local database, and I'm also using an xlsx spreadsheet to bring in data.
In most of my projects I'm working on the ETL in Data Factory, transforming data in stored procedures, etc. For example, for a social media feed I have a Logic App that moves tweets to an Azure Storage NoSQL table; Data Factory then transfers this data into a central Azure Data Warehouse. The Power BI lineage would pick up at the data warehouse stage. It won't take into account that there is a lot of work before this.
Column 2 is the data set in Power BI
Column 3 provides Report information
Column 4 displays the Dashboards
You can click on a data flow node to drill down into more detail
Currently you can't go any further to look at the data items.
Click on the link icon to see the data flow for that item, in this case the report.
This is a great start but there definitely needs to be more information here to make it something that you would want to use as a proper Data Lineage Tool
It would be good to see the fields in each Entity for the Data Sets
As an extra, it would be great to see what fields are being used in Measures and calculated Fields
Reports – for me, I'd like to know, for every page in my report:
What fields am I using from the data source?
What calculated columns have I created (even better with the DAX logic)?
Any name changes from data source to Power BI.
What measures have I created (even better with the DAX logic)?
For the dashboard, what items am I using (fields, measures, calculated columns)?
An important part of data lineage is gaining an understanding of the entire process. This includes data transformations before Power BI. If you can't do that in here, it would be great to be able to extract all the information so you can combine it with your other lineage information to provide the full story. For example:
Azure Data Catalog
Azure Data Catalog is a fully managed cloud service. Users can discover and consume data sources via the catalog, which is a single, central place for the whole organisation to contribute to and understand all your data sources.
I have already registered Our Data Catalog, and I have downloaded the desktop app
As an Example I want to connect to Azure Table Storage (Connect using Azure Account name and Access Key)
At this point I'm registering everything in the storage table. Then I can view the information in the Azure Portal.
You can add a friendly name, a description, an expert (in this case, me), tags, and management information.
I have added Data Preview so you can view the data within the object. There is also documentation and column information to look at.
In the Data Catalog you can manually add plenty of description to your tables, along with documentation.
This is great for providing lots of information about your data. You can explore databases and open the information in other formats (great if you need to supply information to another data lineage package).
I will be having a look at the Azure Data catalog in more detail later to see how it could help to provide full data lineage
Azure Data Factory
Data Factory is the Azure ETL orchestration tool. Go into Monitoring for lineage information. However, there doesn't seem to be a way to export this information for use elsewhere, and Data Factory won't take into account the work done in, for example, a stored procedure.
Again, this is another area to look into in more detail.
When you use stored procedures to transform your data, it's harder to provide automated lineage on your code. There are automated data lineage tools for SQL out there, but it would be great if there were a specific tool within Azure that creates data lineage information from your stored procedures.
Azure Logic Apps
Data for my project is collected via Logic Apps before being Processed into an Azure Data Warehouse.
Essentially, we need our data lineage to capture everything in one place.
And, just as important, everything should be as automated as possible. If I quickly create a measure, the data lineage should reflect this with no manual input needed (unless you want to add a description to the new measure explaining why it was created).
RANKX – RANKX is an iterator. It takes a table and an expression to evaluate. RANKX looks at each row in the table and evaluates the expression, returning the ranking for each row in the table argument. RANKX creates a row context because it is an iterator.
Row context – calculated at processing time instead of at run time. This makes it a calculated column rather than a measure, because the value is set on each row. A good example of a row-context calculation is: is this value greater than 100? If yes, set to True; if no, set to False. This is applied on each row.
CROSSJOIN – CROSSJOIN returns a table containing all combinations of the rows from the tables passed into the cross join.
ALL – Returns all the rows in the table, ignoring any applied filters
So essentially you are ranking against ALL months in the date table and ALL groups within the other dimension table. We are ranking the Avg Fact measure.
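Putting those pieces together, the measure being described looks something like this (the table, column, and measure names are assumptions based on the text, not the real model):

```
Rank All =
RANKX (
    -- every month/group combination, ignoring any filters on the visual
    CROSSJOIN ( ALL ( 'Date'[Month] ), ALL ( 'Group'[Group] ) ),
    [Avg Fact]
)
```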
Now, this works to some extent if you add in the table with the date filter set but no filter on the groups you want. Looking at January, for example:
However, later on in the data set the null values are still being given rankings.
And when I apply the filter to only have certain groupings in the table, the RANKX fails even further.
Note that the RANKX is still ranking every single value, even though we have applied a filter for the Group
This is because of the ALL: the CROSSJOIN takes ALL month values and ALL group values, and dismisses the fact that we have applied a filter.
You need ALL because if you just ranked against the one value that the row context is on, every ranking would be 1.
ALLSELECTED is different to ALL: ALL calculates everything, ignoring filters, while ALLSELECTED takes into account the filters on the visual. An important part of this solution is that we are slicing on Year and filtering for specific groups. As a consequence, we need to use ALLSELECTED.
Remember we are ranking the measure against each Group. What is the best Group this Month?
Next stop, ensuring we don’t rank against null values
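A sketch of the revised measure, swapping ALL for ALLSELECTED and skipping blanks so null values never receive a rank (the names are the same assumptions as in the earlier sketch):

```
Rank Selected =
IF (
    NOT ISBLANK ( [Avg Fact] ),  -- don't rank blank values at all
    RANKX (
        -- only the months and groups surviving the slicers and filters
        CROSSJOIN ( ALLSELECTED ( 'Date'[Month] ), ALLSELECTED ( 'Group'[Group] ) ),
        [Avg Fact]
    )
)
```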