Power BI – reporting on Items NOT in a filter

We are working on a report to look at Power BI activities and objects at the moment.

Let's have a look at the model.

This is a test model with:

  • A Date dimension (when the activity occurred)
  • A Report dimension (details of the report used in the activity)
  • A Dataset dimension (details about the dataset used in the activity)
  • And a fact table of activity metrics

We have a star schema with single-direction joins to the fact table. Power BI, as we know, loves a star.

Let's have a look at my sample data.

Date

Just a sample of the date table

DataSet

Report

Fact

Dataset 3 and Report 4 have never been part of an activity (just for this exercise, 01/01/2000 is our fake date for rows where there is no activity).

The above logic (which items have never been used) is easy to do, but what happens when you want to look at the business question:

Give me all the reports NOT used in September

As we can see from the metrics table, Reports 2 and 4 were not used in September.

So how do we create DAX that will allow us to answer these kinds of business questions?

Base DAX

We want some base DAX measures that we can then build on top of

# reports = DISTINCTCOUNT(ActivityMetrics[ReportKey])
# Datasets = DISTINCTCOUNT(ActivityMetrics[DatasetKey])

And for the reports we never use, we can either use the measures above with a filter on the visual,

or explicitly add a filter into a new measure:

# Reports Never Used = CALCULATE([# reports], DimReport[neverusedFlag] = TRUE())
# Datasets Never Used = CALCULATE([# Datasets], DimDataset[neverusedFlag] = TRUE())

Notice that I set Edit Interactions to off between the date slicer and the two cards, because the cards are simply telling me how many reports and datasets have never been used.

So this part of the process is all fine.

Our next business question is:

Show me the reports and datasets used in September by date

This is another straightforward request.

And both visuals interact with the date filter.

And now to the question

Show me the reports and datasets NOT used in September

So we use our date slicer, but we want to see items outside of that slicer's selection. We need to create a more complex measure to do this.

Is Not in the Date Filter Flag = 
// Assumes a date slicer on DimDate and an implicit filter (e.g. Report Name) on the visual
// The dates on which the current item was actually used, taken from the fact table
VAR VActivityDates = SELECTCOLUMNS(ActivityMetrics, "Dates", ActivityMetrics[DateKey])
// A contiguous calendar spanning the dates currently selected in the slicer
VAR VDates = CALENDAR(MINX(DimDate, DimDate[DateKey]), MAXX(DimDate, DimDate[DateKey]))
// If the two date sets share no rows, the item was not used in the period: return 1
RETURN IF(COUNTROWS(INTERSECT(VActivityDates, VDates)), 0, 1)

Let's break the above down a little to understand what is happening.

SELECTCOLUMNS

VAR VActivityDates = SELECTCOLUMNS(ActivityMetrics, "Dates", ActivityMetrics[DateKey])

First of all we create a variable using SELECTCOLUMNS, which:

"Adds calculated columns to the given table or table expression."

We start with the table the column comes from. Then we give the new column its name ("Dates"), and next comes the expression, which in this case is simply a column reference.

CALENDAR

VAR VDates = CALENDAR(MINX(DimDate,DimDate[DateKey]),MAXX(DimDate,DimDate[DateKey]))

Next comes the VDates variable. CALENDAR returns a table with a single column named "Date" containing a contiguous set of dates. We need to provide the range of dates: in this case MINX finds the earliest date and MAXX finds the latest date in our date dimension.

Now this should work with our slicer. In this case Min = 01/09/2021 and Max = 30/09/2021.
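With the September slicer applied, VDates evaluates as if we had hard-coded the range (a hypothetical illustration, not part of the real measure):

VAR VDates = CALENDAR(DATE(2021, 9, 1), DATE(2021, 9, 30))
// a single-column table named "Date" holding the 30 contiguous September dates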

INTERSECT

INTERSECT(VActivityDates,VDates)

And now we get to the part of the measure that will be RETURNED.

INTERSECT returns "a table that contains all the rows in table_expression1 that are also in table_expression2".

COUNTROWS

COUNTROWS simply counts the number of rows in a table or table expression.

IF

RETURN IF(COUNTROWS(INTERSECT(VActivityDates,VDates)),0,1)

So here we can see INTERSECT and IF working together: if any rows are counted (TRUE), return 0; otherwise return 1.
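As a minimal, hypothetical illustration of the RETURN logic (Flag Illustration is not part of the model, and the dates are hard-coded):

Flag Illustration = 
// A report only used on 5th October, checked against a September slicer range
VAR VActivityDates = { DATE(2021, 10, 5) }
VAR VDates = CALENDAR(DATE(2021, 9, 1), DATE(2021, 9, 30))
// INTERSECT is empty, so COUNTROWS returns blank (treated as FALSE) and IF returns 1
RETURN IF(COUNTROWS(INTERSECT(VActivityDates, VDates)), 0, 1)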

Still struggling to understand? Me too. Let's draw it.

Here is where I am a little confused. I'm not sure which part of the DAX works with the slicer: SELECTCOLUMNS, or CALENDAR with MINX and MAXX? I'm making the assumption that CALENDAR takes the min and max of the slicer selection, but it's incredibly difficult to test, so if anyone has any more insight on this that would be fantastic.

And we can now look at everything together in a report

We have used 'Is Not in the Date Filter Flag' = 1 as our filter value; the measure only ever returns 0 or 1.

But now we are asked another business question

How MANY reports did we not view in the current date slicer?

The measure as it stands needs the Report Name or the Dataset Name in the visual to provide the detail.

We can't use what we have in a card, because a card can't supply the implicit filter (Report Name etc.).

So how can we do this one?

We can solve it using variables

Total Reports not in date Filter = 
// ALLCROSSFILTERED clears all filters which are applied to the specified table
VAR VAllReports = CALCULATE([# reports], ALLCROSSFILTERED(DimDate))
// Total within the date filter
VAR VInFilter = [# reports]
// All reports minus the reports inside the slicer
RETURN VAllReports - VInFilter

Using ALLCROSSFILTERED brings back the total number of reports (4), as it removes all the filters.

Then we get the total of reports inside the date slicer, and finally take the in-filter total away from the all-reports total.
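The same pattern works for datasets; a minimal sketch, reusing the [# Datasets] base measure from earlier:

Total Datasets not in date Filter = 
// Remove the date slicer filter to count every dataset
VAR VAllDatasets = CALCULATE([# Datasets], ALLCROSSFILTERED(DimDate))
// Datasets used inside the slicer selection
VAR VInFilter = [# Datasets]
RETURN VAllDatasets - VInFilter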

Our # reports measure is built from distinct report IDs already. We now have everything we need to build these reports, thanks to DAX.

Power BI Dataflows, Computed Entities and Query Folding

When you create a dataflow in Power BI Service without using Premium capacity, you may hit issues like 'can't refresh dataflow'.

Linked entities are created when you link objects from different dataflows within Power Query Editor. To create transformations over a linked entity, you need to create a computed entity that references the linked entity. There are none of these in this particular project, but I do have…

Computed entities are entities that reference linked entities, and which rely on in-storage calculations for performance and scale, unlike "normal" entities which use the Power Query M engine to extract data from external data sources and to load it into CDM Folders in Azure Data Lake Storage gen2…

https://ssbipolar.com/2018/10/23/dataflows-in-power-bi-overview-part-6-linked-and-computed-entities/#:~:text=Computed%20entities%20are%20entities%20that,Azure%20Data%20Lake%20Storage%20gen2.

In this project, data had been duplicated from the original table in the dataflow, but the new table had accidentally been created as a reference rather than a duplicate, so the dataflow couldn't refresh.

As soon as you reference another query in a dataflow, a computed entity is created.
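In M terms, a reference is simply a query whose source is another query. A minimal sketch, assuming an entity named Customer as in the example further down:

let
    // The source is another query in the same dataflow,
    // which is what makes this entity a computed entity
    Source = Customer
in
    Source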

This led to a bit of research on the differences between Power Query Editor in Desktop and in the Service, and how you can avoid the above issues.

It's also very possible that the dataflow in the Service is slower than in Desktop. After advocating the use of dataflows when creating solutions within teams and large-scale organisations, I wanted to get more information.

To get more understanding of what is going on in the dataflow:

  • The dataflow is stored as csv data. It may be partitioned into multiple physical files.
  • Power BI Service ingests the contents of all of these files and loads them into memory.
  • It then performs a table scan on each set of data to do the join or other transformations.
  • It can't use any indexing or data typing, because the csv storage is schema-less.
  • This is inefficient, and slow.
  • It is possible to move your dataflows into your own Gen2 Data Lake to extend capabilities over the data, for example with Azure Machine Learning.

Let's have a look at some computed entity examples.

Referencing tables

Right click on Customer and choose Reference to create the additional reference table.

Note that the new table shows a ⚡ icon.

Merging tables

Merging also creates computed entities, which means that (without Premium) there is no way to merge dimension keys into fact tables when doing star schema modelling.
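For reference, a merge in Power Query is just a join step like the following (a sketch with hypothetical table and key names); in a dataflow without Premium, this is the step that turns the query into a computed entity:

// Left outer join of the fact to the product dimension on ProductKey
Merged = Table.NestedJoin(FactSales, {"ProductKey"}, DimProduct, {"ProductKey"}, "DimProduct", JoinKind.LeftOuter)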

Append

Computed entities also occur with appended queries

So far it would seem that, for dataflows to be the best option, you should either be on Premium capacity to make use of computed entities, or your transformations should be done at source.

Dataflows and Query Folding

There are also other things to consider when working with dataflows

It would seem that if you build your queries in a dataflow in the Service, Power Query Editor there can't use query folding at all.

Query Folding

Query folding happens in Power Query Editor to help increase performance. It takes your edits and transformations and translates them into native queries (SQL, for example) that run at the data source.

This ensures that the work is done on the original data source without taking up Power BI resources.

There are benefits like faster refresh and automatic compatibility with DirectQuery. This is because the transformations in Power Query Editor are done at source, and only the results are loaded into the Power BI tabular model.

You achieve the best data refresh performance by ensuring that query folding occurs.

https://docs.microsoft.com/en-us/power-bi/guidance/power-query-folding
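As a rough sketch of what a foldable query looks like in practice (the server, database, and filter value here are placeholders, not from this project):

let
    // Connect to a SQL source; folding only works against sources like this
    Source = Sql.Database("myserver.database.windows.net", "AdventureWorksDW"),
    DimDate = Source{[Schema = "dbo", Item = "DimDate"]}[Data],
    // This filter folds: Power Query translates it into a WHERE clause
    // that runs on the server instead of in the mashup engine
    FilteredRows = Table.SelectRows(DimDate, each [CalendarYear] = 2021)
in
    FilteredRows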

Before we have a look at query folding in a Desktop file, let's get some answers on what does and doesn't support query folding. There is no definitive guidance on this, so these are only a few examples.

What does support Query Folding

  • Filtering (on rows or columns)
  • Aggregates and GROUP BY
  • Joins
  • Pivot and unpivot
  • Duplicating columns
  • Numeric calculations
  • Simple transformations, such as UPPER, LOWER etc.
  • Removing columns
  • Renaming columns
  • Merging queries (on the same source)
  • Appending queries (on the same source)
  • Custom columns with simple logic

What doesn’t support Query Folding

  • Merging columns
  • Adding index columns
  • Changing a date to year, month name etc.
  • Merging queries based on different sources
  • Appending queries based on different sources
  • Adding custom columns with complex logic
  • Changing a column's data type

Desktop Power Query Editor

To look at query folding, the source of the Power BI report can't be a data file, because only a database source, like Azure SQL DB for example, can support query folding.

Right click on your step. If View Native Query is enabled, the M query up to that step can be translated into native SQL.

You can also look at the SQL that has been created:

select [_].[DateKey] as [DateKey], [_].[FullDateAlternateKey] as [FullDateAlternateKey],
[_].[DayNumberOfWeek] as [DayNumberOfWeek], [_].[EnglishDayNameOfWeek] as [EnglishDayNameOfWeek],
[_].[SpanishDayNameOfWeek] as [SpanishDayNameOfWeek], [_].[FrenchDayNameOfWeek] as [FrenchDayNameOfWeek],
[_].[DayNumberOfMonth] as [DayNumberOfMonth], [_].[DayNumberOfYear] as [DayNumberOfYear],
[_].[WeekNumberOfYear] as [WeekNumberOfYear], [_].[EnglishMonthName] as [EnglishMonthName],
[_].[SpanishMonthName] as [SpanishMonthName], [_].[FrenchMonthName] as [FrenchMonthName],
[_].[MonthNumberOfYear] as [MonthNumberOfYear], [_].[CalendarQuarter] as [CalendarQuarter],
[_].[CalendarYear] as [CalendarYear], [_].[CalendarSemester] as [CalendarSemester],
[_].[FiscalQuarter] as [FiscalQuarter], [_].[FiscalYear] as [FiscalYear],
[_].[FiscalSemester] as [FiscalSemester], [_].[FullDateAlternateKey] as [FullDateAlternateKey - Copy]
from [dbo].[DimDate] as [_]

Moving on to the next step: creating the month name from the date.

View Native Query is greyed out. You can't query fold on this step, because SQL can't be created to represent deriving the month name from a date.
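Continuing the earlier sketch, this is the kind of step that breaks folding (the column name is taken from the DimDate query above):

// Date.MonthName has no SQL translation, so folding stops at this step
AddedMonthName = Table.AddColumn(FilteredRows, "Month Name", each Date.MonthName([FullDateAlternateKey]), type text)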

And if the date column is duplicated again after this, View Native Query is also greyed out. Once you have done a step that prevents query folding, no further steps can fold.

This represents quite a complex problem. If you want to use query folding for faster refresh, you need a very in-depth understanding of what you can and can't do with query folding (like creating an index etc.), and you should attempt to do all the foldable steps first, before you introduce steps that stop query folding.

After developing lots of Power BI reports, I can't even imagine being able to think at this level.

Merge

Let's have a look at a step you can do that works with query folding.

Merging your queries still allows you to keep the native query:

from [dbo].[DimProduct] as [_]
) as [$Outer]
left outer join [dbo].[DimProductSubcategory] as [$Inner] on ([$Outer].[ProductSubcategoryKey2] = [$Inner].[ProductSubcategoryKey])

So it's really worth doing, especially if you are working with a large dataset. However, it's a difficult one to achieve.

Service Power Query Editor

We have looked at query folding in Desktop. Now let's look at Power Query Editor in the Service (dataflows), which is the recommended way to go at enterprise level.

When you right click on a step in the Service, the View Native Query option is not available.

You can't query fold in the dataflow, and there currently don't seem to be any plans to add it.

Tips

  • Anything that can't be query folded would be better off created within your SQL database reporting area.
  • Look through all your applied steps and find where the View Native Query option is disabled. Are there any steps after this that could use query folding? It may be worth trying to restructure the order of the steps.
  • If you use dataflows you can't do any query folding, which needs to be an accepted negative of going the dataflow route.
  • If you can't query fold, there is the possibility of using native queries. However, be aware that once you do this in Desktop, no query folding can happen on later steps.
  • .xlsx, .csv, .txt files (etc.) can't use query folding, but they don't tend to be large datasets.
  • Do as much transformational processing as you can at source, for example in a SQL reporting database.

Power BI Dataflow issues. Let the whole dev team know

Currently, if your dataflow fails, the only person who will be notified is the owner of the dataflow.

We want all the developers within the team to know. There doesn't appear to be any way to do this at the moment, but a workaround was suggested to me on the Power BI Forums by collinq as an idea starter, and I thought I would run with it and see what happens.

It all relies on a Refresh date

Refresh date

In my main dataflow I have the following Query

This was created from a blank query

let
    Source = #table(type table [LastRefresh = datetime], {{DateTime.LocalNow()}})
in
    Source

This gets updated every time there is a refresh on the main dataflow

Create a New Report in Power BI Desktop

Go to Power BI desktop and Get Data

Dataflow is the source of the data

And we only need this object

I am going to edit the Query in Power Query Editor

The Last Refresh date has been split into Date and Time.

Then a custom column was created for today's date:

DateTime.LocalNow()

This was also split into Date and Time. It is very likely that we will decide to use the time later, which is why it has been added for now.

Now we find the number of days between the last refresh and today

0 - Duration.Days(Duration.From([last Refresh Date] - [Todays Date]))

The 0 - at the start is added to remove the minus sign from the result, so -1 becomes 1.
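Putting those steps together, the whole query looks something like this (a minimal sketch; the hard-coded sample datetime and the column names are illustrative):

let
    // Stands in for the LastRefresh entity coming from the dataflow
    Source = #table(type table [LastRefresh = datetime], {{#datetime(2021, 9, 27, 23, 0, 0)}}),
    // Split the refresh datetime and today's datetime into date parts
    LastRefreshDate = Table.AddColumn(Source, "Last Refresh Date", each DateTime.Date([LastRefresh]), type date),
    TodaysDate = Table.AddColumn(LastRefreshDate, "Todays Date", each DateTime.Date(DateTime.LocalNow()), type date),
    // 0 - ... flips the sign so a refresh in the past gives a positive day count
    DaysBetween = Table.AddColumn(TodaysDate, "Days Between", each 0 - Duration.Days(Duration.From([Last Refresh Date] - [Todays Date])), type number)
in
    DaysBetween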

Close and Apply, and a card is added.

Publish to Service

Power BI Service

Go to the report that has just been created

And Pin the card to a dashboard. In this case, an Issues dashboard has been created

The idea at this point is: if the refresh date isn't updated, the number of days between will become 2 (because today's date will still change) and we can be notified.

This will need testing

Power BI Dashboard (Create Alerts)

Go to the dashboard

Choose Manage Alerts

We want to know if it goes above 1; this means that last night's refresh has failed to happen.

(But if it fails to happen, will the current date refresh?)

At present, an alert will only be sent to the owner of the report, and it will only be seen within Power BI, but we want everyone to know.

This is why we are going to use Power Automate / Logic Apps

We have an Azure Subscription so I can add a Logic App within our subscription

Instead of clicking the above link, we are going to go into Azure, but the principle will be the same.

Save the Alert

Schedule a refresh

The new report has created a dataset

go to Settings and Scheduled refresh to keep this up to date

Azure Logic Apps

Create a new Logic App in Azure

Search for Power BI. We want the trigger 'When a data driven alert is triggered (Preview)'.

I am signing in with my own credentials. (Note that my password is updated every month, so if these credentials are used this will need adding into the governance.)

Our alert has been saved and added to the alert list

Just for the time being it's being left at 3 times a day.

We have our trigger; now we need to know what will happen on the trigger.

New Step

For the time being I chose a fairly easy option of sending an email.

You can search for the dynamic content as you create the body and subject. Here we want to bring attention to the value in the tile and the alert threshold; a sketch of the HTML body follows the list below.

  • The HTML <li> element is used to represent an item in a list.
  • The <strong> tag is used to separate the text from the rest of the content. Browsers traditionally bold the text found within the <strong> tag.
  • The <big> tag is used to make the text one size bigger.
  • The <ul> tag defines an unordered (bulleted) list.
  • The <a> tag defines a hyperlink, which is used to link from one page to another. The most important attribute of the <a> element is the href attribute, which indicates the link's destination.
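As a rough sketch, the email body might look something like this (the bracketed placeholders stand in for the dynamic content tokens that the Power BI trigger provides):

<ul>
  <li><strong><big>Dataflow refresh alert</big></strong></li>
  <li>Tile value: [Tile value dynamic content]</li>
  <li>Alert threshold: [Alert threshold dynamic content]</li>
  <li><a href="https://app.powerbi.com">Open the Issues dashboard in Power BI</a></li>
</ul>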

I have added two users to the email so they can both be notified

Save your logic app. It's ready.

Testing the New Process

The dataflow is scheduled to refresh at 11 PM.

The Dataflow Issues dataset refresh is scheduled at 12 AM.

On the night of the 28th of September, everything failed. I got the failure emails because I am the dataflow owner, but no email from the new alert set-up.

Testing has failed

Let's have a look to see what's happened.

We have two failures, and one is on the dataflow we have set up.

It looks like the refresh token expired. Please go to this dataset’s settings page, and reenter the OAuth2 credentials for the Extension data source.

Going into the report, we still see this:

Which is incorrect.

We would get more of an understanding if we could match up the dates to what is happening.

However, it's clearly not updated.

Dataflow Settings

Scheduled refresh is on and set to run at 12 midnight. The errors were emailed through just after 11.

The alert is there.

Let's go back to Desktop and add some more information.

After a refresh in Desktop we can now see this information,

which is correct. This says to me that, even though we have a refresh scheduled, it possibly didn't refresh in the Service. The new report is published up to the Service.

Back in Service

This is interesting. Our new multi-row card shows the correct information; however, our card still says 1 day, which isn't correct.

A quick refresh of the dataset and we can still see 1 on the card, so we have a difference between Service and Desktop.

A refresh of the report and now it has worked; we can see the 2 days difference.

So there are a few issues here. Why did the card not update on the dataset refresh, but it did when the actual report was refreshed?

It's actually the dashboard that is doing the work here. The new multi-row card is pinned to the dashboard; let's go and have a look at it.

The dashboard only updated once the new visual was pinned to it.

So the failure has been that the report and dashboard didn't refresh, even though the refresh is scheduled.

You can get to the dataset refresh history via Datasets and then Refresh History.

And you can get to the Dataflow refresh history via Dataflows

Data Set Issues Refresh History

Dataflow Issues Refresh History

The actual schedule seems to be fine. All I can think of is that possibly, at 12, the difference is still 1 day, so I could introduce more data refreshes on the Dataflow Issues dataset.

Test 2: Adding more refreshes on the Dataflow Issues dataset

It's a very quick refresh because it's just two dates. Let's see if this changes things.

Power BI August 2020 Updates: Perspectives for Personalised Visuals in Preview

Personalised visuals (the ability for report consumers to change a visual to better suit their needs) have been out for a while, but we need to make sure that users aren't overwhelmed by field options.

Let's look at a quick demo using AdventureWorks data.

First of all, you need to make sure Personalised Views are enabled in Options

Enable Personalised views

It's not enough to turn the preview feature on; you also have to enable it. For this example it's going to be done in Power BI Desktop.

Personalised visuals have been set up at current file level, because there may be other reports that you don't want to enable this on.

First of all create a simple visual

Note the new icon which denotes that it can be personalised.

This visual gets published up to your app workspace, and then published to the Power BI app for your report consumers.

If you have a lot of fields, you might not want to overwhelm users with all of them in Axis and Legend, as above.

Tabular Editor

Back in Desktop, we can open up Tabular Editor (ensure you have the latest version downloaded).

You are immediately connected to the tabular data model and can see the tables and the empty Perspectives folder.

Right click on Perspectives, choose New Perspective, and name the perspective.

Go to the tables and fields, then right click and choose Add to Perspective.

When you add an item to a perspective, the view doesn't change, which is a little annoying because you can't see what you have added. If you try adding the item again it will be greyed out, which is the only way of knowing at this point.

  • It would be helpful to see the list of items that you have added

Save in Tabular Editor and also in your Power BI desktop file

Back to Power BI Desktop

Without selecting a visual, click on Format for the entire report.

There is a new Personalize visual menu item, which can now be set to the perspective created in Tabular Editor.

Hopefully soon we will be able to select perspectives at visual level rather than at report level, to further lock down what should be seen per visual. That would be most helpful.

Save and publish to the Service, and your user will only be able to work with the items in the perspective.

Power BI Service

Now only the items in the perspective are available to the user.

Future hopes for Personalised Visuals

  • Make it easier to see the list of items in the perspective in Tabular Editor
  • Set personalisation at visual level rather than report level
  • It would be even better to have perspectives set at Axis, Legend and Values level. You could make all measures available to a Values perspective only, which would be really helpful.

So a great start, but I hope they introduce the features above to make this new functionality even better for our users.

Power BI Table Visuals and the Star Schema: ensuring you have a connection between your visuals

Working with users who like a lot of table visuals in their reports, there is sometimes a tendency not to include the measure in the table visual, which can result in the following annoying situation.

Using another really basic example

The fact table is about events, and it joins to the dimensions on EventtypeKey and DateKey.

Our table visual contains the Event Description from the Event dimension. This visual could contain data items from multiple dimensions, but the thing to understand here is that these dimensions are not part of the column chart visual.

The column chart contains the Month Name from the Date dimension and the Number of Events measure:

Number of events = DISTINCTCOUNT('Fact'[EventKey])

If we click on a column, the hope is that the table visual interacts and we only see events in that month.

Not the case. You can click on a bar and you still see every event description; in this instance our joins are not helping. How do you resolve this?

Adding the measure into the table visual

The connection is now there because of the measure, so this works. But the user doesn't want the measure in the table visual.

So what next?

Add a Filter to the table visual

If we add a visual-level filter so that we only show items where that specific measure is not blank, we can finally see the visuals interact.

A simple filter and a bit of thought about the data model and the joins has resolved the issue, and now we can have tables with data from dimensions that aren't used in the other visuals.
