Design a site like this with WordPress.com
Get started

Power BI DAX – Create a percentage across days of week (ALLEXCEPT. ALL. DIVIDE)

We just looked at the following request

We have sold specific quantities of products. Can we look at a week for a year in a slicer and see the percentage of items sold for each day. Which days do we sell the most products overall?

We have the following model (Just the parts that will be required for this exercise

And thanks to the Power BI Forums for giving me some help on figuring this out.

What we want to end up with is:

  • A year Slicer
  • Product category on rows of a matrix
  • The Day Name from Date on the Columns of the Matrix
  • And % Quantity for that day based on all the days of the week as the value

Power Query Editor

At first I believed that I needed to add some kind of Ranking order for the DAX to use and as this will be a column its better to do in Power Query Editor (Or in the Sort DB)

To speed things up I created a Year Week Column in SQL DB consisting of the following examples

  • 202101
  • 202102
  • 202135

So the weeks 1 to 9 was padded out with a zero. I then formatted this to a number field and called it Year Week. Every seven rows within that week in the date table will have the same Year Week Number Applied

Measure

I can now create the measure. Lets have a look at it in a bit more detail

Order Quantity by Week % = Var varOrderQtyWeek = CALCULATE(SUM(FactResellerSales[Order Quantity]), FILTER(ALL(DimDate), DimDate[Year Week]))
RETURN DIVIDE(SUM(FactResellerSales[Order Quantity]), varOrderQtyWeek)

And this measure is then set as a percentage.

First of all we create a variable. It gets the Sum of Order Quantity and it filters by using an ALL on the column Year Week we just created in the Date Table

“ALL Returns all the rows in a table, or all the values in a column, ignoring any filters that might have been applied (On Year Week).”

And we return the sum of order Quantity (Which in the context is just for, for example Friday and Accessories by the Sum of Order Quantity

This appears to work

We have Implicitly filtered by Product Category and Day Name of Week.

the question here is, did we even need to set ALL on Year Week column in date. Could we have just said use the entire date

Order Quantity by Week % = Var 
varOrderQtyWeek = CALCULATE(SUM(FactResellerSales[Order Quantity]), ALL(DimDate)) 

RETURN DIVIDE(SUM(FactResellerSales[Order Quantity]), varOrderQtyWeek)

This is working in the same way and makes sense to me. We are using all the dates in the date table. And removing the Filter will create a faster DAX query.

Its looking at the entire data set and we can see that everything totals to 100%. So for us, taking into account all years Thursdays look like a good day especially for accessories.

However we don’t want to analyse the full data set and when we add in a year slicer the logic fails

As you can see, the story is still true but its telling the wrong story. Now for Accessories we have 31% of sales happening in 2021, attributing to smaller percentages across the days.

So we want to change the DAX to accept the Year Slicer

Order Quantity by Week % = 

Var VarALLEXCEPT = CALCULATE(SUM(FactResellerSales[Order Quantity]), ALLEXCEPT(DimDate,DimDate[CalendarYear]))

RETURN 
DIVIDE(SUM(FactResellerSales[Order Quantity]), VarALLEXCEPT)

And this appears to now work just taking the year 2012 into consideration because we are using ALLEXCEPT

Thursday is definitely a good day.

“ELLEXCEPT Removes all context filters in the table except filters that have been applied to the specified columns.”

So we create a variable to get the Sum or Order Quantity with ALLEXCEPT our Date Slicer which in this case is the calendar year slicer

So we are using 2012 from the date Dimension in the slicer. And Bikes from Product and Thursday as the Week Name from date in the Matrix

We Divide the Sum of Quantity which has all the filters applied. Bikes. Thursday. 2021

By the Sum Of Quantity with all, Except we do apply the Calendar Year.

  • DIVIDE(50/100)
  • 50 is Bikes on a Thursday in 2012
  • 100 is Bikes in 2012

DAX is something I do struggle with a little. You think you understand it. Turn your head from the screen and its gone again.

But hopefully this goes a little way to understanding how this DAX in this specific context has been approached

Advertisement

Power BI – reporting on Items NOT in a filter

We are working on a report to look at Power BI activities and objects at the moment.

Lets have a look at the model

This is a test model with a date dimension (When the Activity occured)

  • A Report dimension (Details of the report used in the activity)
  • A Dataset dimension (Details about the dataset used in the activity)
  • And a Fact table of Activity metrics

We have a star schema with single direction joins to the fact. Power BI as we know loves a star.

Lets have a look at my sample data

Date

Just a sample of the date table

DataSet

Report

Fact

DataSet 3 and report 4 have never been part of an activity (Just for this exercise 01/01/2000 is our fake date because there is no activity)

The above logic is easy to do (Which items have never been used) but what happens when you want to look at the business question

Give me all the reports NOT used in September

As we can see from the metrics table report 2 and 4 were not used in September

So how do we create DAX that will allow us to look at these kind of Business Questions?

Base DAX

We want some base DAX measures that we can then build on top of

# reports = DISTINCTCOUNT(ActivityMetrics[ReportKey])
# Datasets = DISTINCTCOUNT(ActivityMetrics[DatasetKey])

And for the reports we never use we can either use the measure above with a filter

Or explicitly add a filter into a new measure

# Reports Never Used = CALCULATE([# reports],DimReport[neverusedFlag]=true)
# Datasets Never Used = CALCULATE([# Datasets],DimDataset[neverusedFlag]=true) 

Notice that I set Edit Interactions to off between the date slicer and the two cards because the cards are simply telling me how many reports and dashboards have never been used

So this part of the process is all fine.

Our Next Business Question is

Show me the reports and datasets used in September by date

This is another straight forward request

And both interact with the Date Filter.

And now to the question

Show me the reports and datasets NOT used in September

So we use our date slicer and we want to see items outside of that date slicer. We need to create a more complex measure to do what we need

Is Not in the Date Filter Flag = 
// assumes a slicer on DimDate[ActivityDateKey] and there is an implicit filter on the visual 
VAR VActivityDates = SELECTCOLUMNS(DimDate,"Dates", DimDate[DateKey]) 
VAR VDates = CALENDAR(MINX(DimDate,DimDate[DateKey]),MAXX(DimDate,DimDate[DateKey]))
RETURN IF(COUNTROWS(INTERSECT(VActivityDates,VDates)),0,1) 
// if no matching rows, return 1

Lets break the above down a little to understand what is happening

SELECTCOLUMNS

VAR VActivityDates = SELECTCOLUMNS(DimDate,”Dates”, DimDate[DateKey])

First of all create a variable. SELECTCOLUMNS

“Adds calculated columns to the given table or table expression.”

We start with the table they come from. then we give the name to the column in SELECTCOLUMNS “ColumnName”, Next comes the expression which in this case is the column reference.

CALENDAR

VAR VDates = CALENDAR(MINX(DimDate,DimDate[DateKey]),MAXX(DimDate,DimDate[DateKey]))

Next comes the VDates Variable. Calendar Returns a table with a single column named “Date” and it contains a contiguous set of dates. We need to provide the range of dates. In this case MINX finds the earliest date and MAXX finds the Maximum date in our date dimension

Now this should work with our slicer. In this case Min 01/09/2021 Max 30/09/2021

INTERSECT

INTERSECT(VActivityDates,VDates)

And now we get to the measure that will be RETURNED

“INTERSECT A table that contains all the rows in table_expression1 that are also in table_expression2”

COUNTROWS

COUNTROWS simply counts the number of rows within a table or an expression

IF

RETURN IF(COUNTROWS(INTERSECT(VActivityDates,VDates)),0,1)

So here we can see the intersect and if working together. If there are rows counted (True – There are rows then 0 else 1.

Still Struggling to understand? Me too. Lets draw it

Here is where I am a little confused. I’m not sure which part of the DAX works with the slicer. SELECTCOLUMNS or CALENDAR MINX MAXX? I’m making the assumption that CALENDAR takes the min and max of the slicer but its incredibly difficult to test so if anyone has any more insight on this that would be fantastic.

And we can now look at everything together in a report

We have used Is Not =1 in the ‘Is not in the Date Filter’ as out value and its always ever 0 or 1

But now we are asked another business question

How MANY reports did we not view in the current date slicer?

with the measure as it stands. It needs the Report Name or the Data set name to provide details

We can’t use what we have in a card because we cant add the implicit filter (Report Name etc).

So How can we do this one?

We can solve it using variables

Total Reports not in date Filter = 
//Clear all filters which are applied to the specified table.ALLCROSSFILTERED
var VAllReports =CALCULATE([# reports],ALLCROSSFILTERED(DimDate))
//Total within the date filtervar 
VInFilter = [# reports]
//All reports - Reports inside the slicer
Return VAllReports-VInFilter

Using ALLCROSSFILTERED brings back the total reports 4 as it removes all the filters

then we get back the total of reports in the date slicer and finally take in filter away from all the reports.

Out # reports is a measure made from distinct report IDs already. We now have everything we need to create these reports thanks to DAX.

Power BI Can you use an inactive join for a slicer (USERELATIONSHIP)

Imagine the scenario. You are dealing with two dates. date A and B. You create an active and an inactive relationship to your date table.

You can create Dax with USERELATIONSHIP for any measures that use the inactive join. Great for the bar charts, line charts etc.

But what if you have a slicer for Year on the report? At the moment that slicer is connected to Date A because you simple drag year from the date table.

This report needs to be sliced against Date B. Is there a way to do this?

Lets look at one of our measures that uses USERELATIONSHIP

YTD Measure =

CALCULATE(TOTALYTD([Number of Records],’Date'[Date]), USERELATIONSHIP(Metrics[DateKeyB],’Date'[DateKey]))

The USERELATIONSHIP can only be used with CALCULATE,CALCULATETABLE, CLOSINGBALANCEMONTH, CLOSINGBALANCEQUARTER, CLOSINGBALANCEYEAR, OPENINGBALANCEMONTH, OPENINGBALANCEQUARTER, OPENINGBALANCEYEAR, TOTALMTD, TOTALQTD and TOTALYTD 

From this it is clear you can only use it in a measure with a metric.

And you cannot use a Measure in a slicer. Slicers have to be created from a column.

Which means that our inactive dates can’t be used in a slicer. And this is problematic because if you have a page of time based visuals all using the USERELATIONSHIP then its a fair shout that you are going to have a slicer too for example ‘Year’

So can we come up with a workaround?

Workaround 1. Adding date information to the fact table

Out model rules are

  • DateKeys in the Fact table. I also hold the dates hidden just in case
  • The dateKey Joins to the date Dimension which contains Year, Day, Month etc

So what I am going to do in this case is add my relationship B details into the fact table just for the time being.

s.[DateB] AS [Date B],
--Workaround 1
DATEPART(Year,s.[DateB]) AS [Year B],
DATENAME(Month,s.[DateB]) AS [Month B],
DATEPart(Month,s.[DateB]) AS [Month No B],
[staging].[fnQuarter](s.[DateB]) AS [Quarter B],

I have created a Quarter function with the specific quarter logic which is use for every Quarter period.

This is then refreshed into the Power BI Dataflow

And then refreshed into my data set.

All ready to work with. What I decided to do in Power BI was create a table containing those values.

Support End Date (For Slicers) = 
SUMMARIZE ( 'GrowthHubMetrics', 
GrowthHubMetrics[SupportDateKey],
'GrowthHubMetrics'[Support End Quarter], 
'GrowthHubMetrics'[Support End Month], 
'GrowthHubMetrics'[Support End Month No], 
'GrowthHubMetrics'[Support End Year], 
'GrowthHubMetrics'[Support End Date])

The dates in the fact table were then hidden leaving the above Slicer table.

And then I remembered that Date time Measures don’t work without a continuous date table and DateB isn’t continuous.

But then I remembered that I had already put in a lot of work creating all the date time measures on the inactive join to the date table using USERELATIONSHIP so actually all the date time measure are fine.

Would I go back and create a date table for dateA and a date table for DateB so I can have slicers and drill down to the date. Its very possible but that renders the use of USERELATIONSHIP Pretty null and void, unless you still want to slice these visuals against DateA.

Connect the DAX table to the model.

the date Table for Slicers is connected up to Dimdate in the model and we are ready to go

So all the measures are still connected to the date table through the inactive join

But the slicer can be set up over the new DAX table

I’m hoping to get a better way of doing this as we go because it seems that you should be able to do this without having to create another table.

Update

In the end, this became so complex and issue heavy (Drill throughs not working properly etc) I gave up. Created a reference of the date table for Date B. Connected this up. removed the DAX table and changed all my measures to get rid of USERRELATIONSHIP.

So Pages 1 to2 use DateA and Pages 3 and 4 use DATEB.

As a conclusion, if you are wanting to use Drill through and Drill down I think I would recommend role playing dimensions. the complexity of the USERELATIONSHIP with just seems incredibly complex at the moment.

Its a shame because I love the idea of switching to a none active join but in practice its proved unworkable.

If you have any tips on this please let me know

Power BI DAX – CALCULATE AND FILTER and Implicit and Explicit filtering

As ever with DAX, I tend to have to remind myself of the basic every so often, especially when i have bee concentrating on other topics.

We need to remember the following

Model Relationships propagate filters to other tables.

Product can be sold many times. (1 to many)

Have a look at the model in your Power BI desktop file


If you click on the join

You can see Cross Filter Direction (In this case Single) We can Filter the measure within the fact table by, for example, Product Name. But we cant Filter Product Name by, for example Sales amount within the fact table

When you are looking at Measures you basically filter a measure by either an implicit filter or an explicit filter within the DAX.

Confusing? How can the above information not be?

Lets take this a step backwards by Looking at implicit filtering


using the above example we have taken Color from the Product table and Order Quantity (The Metric) from the Fact table

We implicitly Filter Order Quantity by colour. No need to specify anything in DAX

Implicit – Suggested though not directly expressed

CALCULATE

Lets create a measure

Order Quanity of Yellow Products = CALCULATE(SUM(FactInternetSales[OrderQuantity]),DimProduct[Color]=”Yellow”)

So here CALCULATE evaluates the summed value of Order Quantity, against the explicit filter of color = Yellow. So in the above visual, it ignores the implicit value of the Color filter within the visual.

Here is a nice reminder


If you drag a field into your visual its an implicit filter

If you set the filter within your DAX its an explicit filter and it will override what is in your visual

CALCULATE allows you to calculate a value against a context (The filter modification).

Lets change this slightly

Order Quantity of Yellow Products = CALCULATE(SUM(FactInternetSales[OrderQuantity]),ALL(DimProduct))

Now instead of Colour = “Yellow” We are saying Calculate against all Products

This time, note the total matches the total. this is because we are ignoring the colour context and Getting the total of ALL products

* quick sanity check. The visual isn’t quite right. The totals should be the same on the rows and the total column. This must simply be an issue with the visual I used here.

FILTER

Filter basically returns a table that has been filtered. Lets have a look at a FILTER Function used in very much the Same way as CALCULATE above

TOTAL Order Quantity Yellow colour = CALCULATE(SUM(FactResellerSalesXL_CCI[Order Quantity]),FILTER(DimProduct,DimProduct[Color]="Yellow"))

We are again calculating the sum of Order Quantity, with a Filter of Color = yellow. Lets look at how the visual changes

This time, instead of seeing the total for yellow against every other colour quantity, we only as see the measure against yellow.

the great thing about FILTER is that you can have more than one Filter as an OR

TOTAL Order Quanity Yellow & Black = CALCULATE(SUM(FactResellerSalesXL_CCI[Order Quantity]),FILTER(DimProduct,DimProduct[Color]="Yellow"|| DimProduct[Color]="Black"))

Now we can see Yellow OR Black Quantities

how about if we want to see only yellow products in 2014. FILTER comes in useful for this

TOTAL Order Quanity Yellow & Black = CALCULATE(SUM(FactResellerSalesXL_CCI[Order Quantity]),FILTER(DimProduct,DimProduct[Color]="Yellow"),FILTER(DimDate,DimDate[CalendarYear] = 2014))

This time, Not only are we only seeing Yellow product Quantity but the Quantity sold in 2014

FILTER is obviously slower than CALCULATE so if you are only filtering on one thing, go for CALCULATE.

The Dos and Donts of Power BI relationships and Modelling

Bad Modelling 1: Single flat file table e.g. Salesextract

When people first start out using Power BI as their Analytics platform, there is a tendency to say, lets import all the data in one big flat file, like an Excel worksheet.

This way of working is just not well organised and doesn’t give you a friendly analytics structure.

Avoid Wide Tables

Narrow tables are much better to work with in Power BI. As the data volumes grows it will affect performance and bloat your model and become inefficient. then, when you create measures, things will start getting even more overly complex in the one long and wide table.

Not to mention the point when you have to add another table and create joins. You may be faced with the many to many join because of your wide table.

STAR SCHEMA are the recommended approach to modelling in Power BI

Stars with a few Snowflaked dimensions are also ok.

If you have a flat file wide table its always important to convert to an above data model with narrow dimension tables and a fact table in the middle with all your measures.

Remember, Chaos is a flat file.

Model Relationships propagate filters to other tables.

In this example the ProductID propagates down to the sales table. 1 Product can be sold many times. (1 to many)

With a snowflake you can add another level

CategoryA Propagates down to the Sales Fact table

Deliver the right number of tables with the right relationships in place.

Power BI was designed for the people who never had to think about the design of data warehouses. originally, this self service tool would allow any one with little or no knowledge of best practice to import data from their own sources, excel spreadsheets, databases etc without any knowledge of how they were set up.

This becomes an issue when the recommended Power BI model is the fact and dimension schemas as above.

Understanding OLAP models go a long way to helping you set up Power BI

  • Dimensions Filter and group
  • Facts Summarise measures

Bad Modelling 2: Direct Query your Transactional Database

When you connect up to OLTP and drag in all your tables ( there may be hundreds of them) using Direct Query there are lots of things to consider.

the overall performance depends on the underlying data source

When you have lots of users opening shared reports, lots of visuals are refreshed and queries are sent to the underlying source. This means that the source MUST handle these query loads for all your users AND maintain reasonable performance for those using the OLTP as they enter data.

You are not the most important person in this scenario. The person(s) using the database to add data is the most important person

OLTP is designed for speedy data input. OLAP is designed for speedy retrieval of data for analytics. These are to very different things.

With OLTP, you have row-Store indexes (Clustered Index, Non-Clustered Index) and these are slow for data analysis. They are perfect for OLTP style workloads. Data Warehouse queries, consume a huge amount of data, this is another reason why using OLTP as your direct query data source isn’t the best approach.

Also your Direct Query means you loose a fair amount of DAX functionality time time based DAX calculations, What if Parameters, etc.

I was chatting to someone about this on the forums and they gave me a fantastic analogy

When you connect into a transactional database with Direct Query, its like being in a busy restaurant and getting all the customers to go and get their food from the kitchen.

It slows down the customers because of the layout of the kitchen. They don’t know where anything is, and other customers are also milling around trying to find where their starter is.

the Kitchen staff who are now trying to prepare the food are having to fight for physical space. Look at the pastry chef, trying to work around 10 customers asking where their various desserts are?

So you set up a reporting area. This is where the food gets placed, someone shouts service and a waiter will go and speedily deliver the course to the correct table.

No one needs to go into the kitchen unless they are in food prep. Everything works in the most efficient way.

Model relationships Dos

  • Only 1 ID to One ID. If you have composite keys they need to be merged
  • No recursive Relationships (relationships that go back to the same table. the example always used for this is the managerID in the employer table
  • the Cardinality is 1 to many. 1 to 1. many to one. (Many to Many needs a specific approach in Power BI)
  • Cardinality determines whether it has filter group behavior or summarise behavior
  • There can only be one active path (relationship) Between two tables. All your other paths will be inactive (But you can set up DAX to use them)

In this example OrderDateKey is the active relationship because we use this the most and joins to DateKey

ShipdateKey and DueDateKey also join to DateKey in the date table and are inactive.

DAX Functions for Relationships to help with modelling decisions

RELATED

When creating calculated columns you can only include fields from the same table. Unless you use RELATED

For example, I’m adding the column Colour into the SalesOrderDetail table which has a Many to One join to Products •Colour = RELATED(Products[Colour])

RELATED allows you to use data from the one side in the many side of the join

RELATEDTABLE

RELATEDTABLE Uses data from the Many side of the Join

TotalSales = SUMX(RELATEDTABLE(SalesOrderDetail),SalesOrderDetail[LineTotal])

USERELATIONSHIP

Forces you to use a relationship instead of the active relationship

=CALCULATE(SUM(InternetSales[SalesAmount]), USERELATIONSHIP(InternetSales[DueDate], DateTime[Date]))

CROSSFILTER

Modifies the filter direction Disables propagation. You can actually do this in the model by changing the filter to both directions instead of single. OR you can do it for a specific DAX query using CROSSFILTER

TREATAS

Create virtual relationships between tables

CALCULATE([Total Budget],

TREATAS(VALUES(Dates[Year]),’Unconnected Budged Data'[Year]))

Our Unconnected Budgeted Data is in Year only and its not joined to our main model.

Here we connect up to Year in Date. then we can create a visal with Date from the Date dimension. Total sales from our connected data which is at daily level and Total Budget from our unconnected budgeted data at a different level of granularity.

PATH

Naturalise a recursive relationship with the PATH function

Getting your model right and understanding your data sources is the most important thing to get right with Power BI. make sure you don’t have lots of headaches six months into your project. Its better to spend the time now, than having to start again later.

Introduction to DAX

I have gone through a fair few training courses on DAX now because its an important part of Microsoft Analytics. I thought It would be nice to include all my notes from one of the first DAX courses I attempted.

You don’t have to be an expert on DAX to get started with Power BI. You can start with a few of the basics and get lots of help along the way in the forums.

DAX (Data Analysis Expressions)  is an expression language for slicing and dicing analytical data

Where is DAX Used?

  • Power BI
  • Power Pivot (Excel)           
  • SSAS Tabular Mode (You can’t use it in Multidimensional mode)
  • Azure Analysis Services (This is only available in Tabular Mode)

DAX IS USED HEAVILY IN THE MICROSOFT BI STACK

Where Does DAX Shine?

  • Aggregations and filtering. Its optimized for both

What you need

  • An easy way of defining key metrics.
  • You need to be able to slice and dice
  • You need to be able to do historical analysis
  • DAX is an easy way of defining key metrics.

What is DAX not good at?

  • Operational Reporting – Detail heavy used for day to day operating. Line by Line Reports
  • Wide tables – Tables with a lot of columns
  • Many to Many relationships. There are ways around this, but it can be difficult to resolve

Its worth noting that this visual was provided over a year ago when Azure Analysis Services was the only way of creating the back end centralised model. we are now at December 2020 and Power BI Has moved on to become not only Self Service, but a really great way to implement a Standard centralised model for Enterprise reporting. You do this by creating data flows in Service, which establishes your transformations in one specific area for reuse. Then over this, data sets can be created that can be reused by other developers once you have promoted them.

Thinking in columns not rows

  • OLTP – Online Transactional Processing. (Normalised schemas for frequent updates)
  • OLAP – Online analytical Processing (Fact and Dimensions. Star Schemas etc.)
  • Single column aggregations – Sales by area etc. (OLAP)
  • Large number of rows – Sales by year, we have 10 years of data. Lots of rows (OLAP)
  • Repeated values – Data is flattened unlike in an OLTP
  • Need to quickly apply filters By Area, Postcode, Year etc.

We may only want 3 columns in a table that has 50 columns. We just want the slice of those 3 columns and we don’t want to read the other columns

For this we store data as columns Not Rows

Vertipaq

Also known as xVelocity. It is the engine used to store data as columns. The data is imported into Power BI data center as Columns not rows

Direct Query

If you set up your connection as a direct Query, Power BI has to translates DAX formulas into relational SQL queries so you lose a lot of the functionality like Time DAX because its too complex for SQL.

Compression and Encoding

This is how Power BI Compresses your data. Each column is compressed separately

  • Value Encoding
  • Dictionary Encoding
  • Run-Length Encoding

Value Encoding

Let’s take the lowest value in the range

and then store everything else as the delta (Difference) of that value C. 

This makes the Code shorter in length, compressing the data

Dictionary Encoding

Actual Column

The value gets assigned a number which is then used in place of the actual data item

This takes up far less space.

Run-length encoding

The data is sorted

This just keeps the colour and the number of repeats. Repeating values creates excellent compression. Unique values, not so much.

Adding Business Logic with Calculated Columns and Measures

The following examples have been set up using the old faithful Adventureworks data base.

Im initially bringing through two tables

  • Sales.SalesOrderDetail
  • Production.Product

and renaming them to Products and SalesOrderDetail

We are now ready to try some examples

Calculated Columns

Where should you be creating calculated columns? In Power Query using M Language or in DAX.

the speedy answer is, if you can, always create them in Power Query. However, you do need to understand how the calculated column works in DAX in order to understand measures

  • Expands a table by adding a new column.
  • Operational, Detailed information. Can only look at the specific row
  • It’s an expression that produces a column. Computes at the time of data refresh and is stored within the table
  • Limited by a row context. Price-Discount. They don’t take advantage of the columnar database

In Power BI we are going to set up a Line total in SalesOrderDetail because you need where possible to put the new column in the right table with the data that is being used to create it

Click on … Against the table under fields in power BI and New Column

LineTotal = SalesOrderDetail[UnitPrice] * SalesOrderDetail[OrderQty] * (1- SalesOrderDetail[UnitPriceDiscount])

It calculates the values for each row

The new column now sits in the model with the other fields (Go to Data Tab)

And you can use this like any other metric. In this table we are filtering by product Colour. Line Total has already been calculated within the row context.

You can now use Line Total in a table visualization because we can treat this column like any other column and its now being set in the implicit filter context which in this case is Color.

Implicit context filters are when you add a description column which aggregates your metric within the visual. Or if you add a filter to the visual. We will look at Explicit filters later.

Related

When you create a column you can only create it from data in the same table unless you add RELATED or RELATEDTABLE into your DAX

Adding Color into the SalesOrderDetailTable from Products

1 Product can be sold Many times.

Colour = RELATED(Products[Color])

I have added data from another table into a table. RELATED allows you to pull data across from another related table. Note that in the previous DAX I only used data from one table, so I didn’t need to use RELATED.

RELATED is for the One to Many Relationship. Using data from the one side in the many side. Colour (from 1) now sits in salesOrderDetail (M)

RELATEDTABLE

RELATEDTABLE uses the data from the many in the one side

We are in Product

TotalSales = SUMX(RELATEDTABLE(SalesOrderDetail),SalesOrderDetail[LineTotal])

The SUMX is an Iterator. Takes a table and an expression to evaluate. This means We are going through each row in the Products table and its running the evaluation. It’s using the related table to go through all of the LineTotal’s that we created in SalesOrderDetail

So for example

Row for Product ID 707. Sum up the Line total in salesOrder Detail where ProductID 707 and add the line total into Product

Next iteration. Row for ProductID 708. Sum up the Line total in salesOrder Detail where ProductID 708 and add the line total into Product

And repeat until you have iterated through the entire table for each row

Again, a quick rule of thumb is, if you are creating a calculated column using data from the same table, do it in Power Query Editor. If you have to use data from other tables, use DAX.

Your data will be compressed like any other field if its been created with Power Query Editor but it wont be if its a calculated column because it happens after compression.

Measures

  • Summarises all the data into a single value. Not stored on the table.
  • Analytical. Takes a column and brings back a summary
  • They are computed at runtime so stored temporarily
  • Every time you open a report, your Measures are computed
  • Limited by a filter context (Rather than a row context)
  • A measure looks at the data minus what has been filtered out by the user at that time.
  • They are more loosely associated with a table, so you don’t need to RELATE tables

Again, In Power BI on the SalesOrderDetail Table:  Create Measure

Minimum Price = Min(SalesOrderDetail[UnitPrice])

You can’t add your measure to a slicer or use as a filter because its created after the filters have been set

Implicit Measure – Underneath the hood is a measure for anything that you aggregate that is created by the DAX engine. For example your Total column. Here we are implicitly filtering UnitPrice by Colour to get the minimum value.

If you go into the Data tab, you wont see this measure because it only exists after you create your visual.

Filter data Using Calculate

Implicit

Applied by the user or the layout of the report. The visuals implicitly filter the data. When you drill up and down on a visual for example.

Explicit

Coded into the DAX Expression. Explicit filtering overrides Implicit filtering

Calculate

Calculate changes the current evaluation context

In Power BI set up a Table containing Products Colour and Unit Price (Set to Average rather than Sum)

BlackAveragePrice = CALCULATE(AVERAGE(SalesOrderDetail[UnitPrice]),Products[Color]=”Black”)

This Calculates the Sales Total Only on the colour Black. This overrides the implicit filters of colour. It changes the Current Evaluation context, e.g. White or yellow

Add a filter and notice that this is also ignored because the Explicit context filter ignores the Implicit Context Filter of red

Next create a measure

IsColorFiltered = ISFILTERED(Products[Color])

ISFILTERED Return value. TRUE when columnName is being filtered directly. 

Colour is filtering each of the line items but is not filtered for the total because it’s the average of everything

Now the Grand total is also filtered by Grey once we have filtered on grey using the slicer so both are now true

Calculate is designed to be fast. For more complicated examples we need to use Filter.

Calculate only lets you compare a single column to a fixed value

Filter

CALCULATE( SUM ( [Total] ), FILTER (Sales, QTY * Price < 100 ))

Filter takes in two values. The Quantity and Price and then applies the filter of less than 100

Now, We want to see The Average Unit Price Discount Where Order Quantity * Unit Price * Unit Price Discount Any time the total discount is greater than 10 dollars, we want that to apply to the average

As an example, In Power BI we need another table of Color and UnitPriceDiscount (Set as Average)

Create a measure in SalesOrderDetail

DAX is a functional language, its created from functions like SUMX etc.

DAX is executed from the innermost parameter so don’t create all your DAX on one line.

AverageDiscountGreaterthanTen =
CALCULATE(AVERAGE
(SalesOrderDetail[UnitPriceDiscount]),
FILTER(SalesOrderDetail,
SalesOrderDetail[OrderQty] *
SalesOrderDetail[UnitPrice] *
SalesOrderDetail[UnitPriceDiscount]>10
)
)

Filter takes in two parameters. We need to tell it the table and then the expression we want.

Filter is creating a table of values to work with because it is a table function. Its always embedded in other functions, in this case CALCULATE,

You cant have measures inside a filter.

to start to understand this I’m going to add details for a specific order and product. Im going in at the lowest level of granularity

I wanted to see exactly what was going on in the filter so I created a calculated column working on the row context

Qty*Price*Discount = SalesOrderDetail[OrderQty]*SalesOrderDetail[UnitPrice]*SalesOrderDetail[UnitPriceDiscount]

So for Full Finger Gloves L.44 were bought, Around 17 dollars each with a discount of 15%

the Average of UnitPriceDiscount is the same as the Summed UnitPriceDiscount because the products are all identical with the same discount applied

44 $17 gloves would bring in $752. Applying the discount of 15% (0.15) shows is that £112.83 has been discounted from the total price.

This is above 10 so we can simply display the average Unit Price

ALL

Historical Sales = CALCULATE (SUM ([Total]), ALL (‘Date’) )

ALL clears any filters in the entire date table

For this example, we want the Sum of Order Quantity for the specific context filter divided by (We now need calculate because we are manipulating filters) Sum of all Order Quantities.

ALL takes in just the product table because we want to undo any filters from the Product table (New Measure in the Products Table)

% of total sales = SUM(SalesOrderDetail[OrderQty]) / CALCULATE(SUM(SalesOrderDetail[OrderQty]), ALL(Products))

So for Black context filter the Order Quantity is 81937 / the total (We have removed the expression filter for the entire product table) of 274914 so we can see that Black is 29.80% or the order total.

With a slicer on the report. We choose One Product and even though its the only row in the visual Total Sales shows -.13% instead of 100% which is just what we want.

We still have 18% of Total Sales instead of it showing 100% because that’s the only row left in the visual. Perfect. This is what would have happened without ALL

% of total sales = SUM(SalesOrderDetail[OrderQty]) / CALCULATE(SUM(SalesOrderDetail[OrderQty]))

Looping over the Data with Iterators

Allows you to use multiple columns in an aggregation

What is an iterator?

Power BI’s DAX engine is optimized for working with columns not rows.

Iterators can process on a row by row manner. Its less performant because its working with rows. 

Takes in a table parameter (What is it looping through)

Then takes in an expression (For each row, evaluate this expression)

We have used something before that takes in the table and then an expression. The Filter function. The Filter function is an iterator.

Many of them end in X. SUMX MAXX CONCTENATEX etc

Average Gross Sales =

AVERAGEX(Category, [Quantity] * [Price] )

You could create Quantity* Price as a calculated column so each row gets this valuated first, and then average this column separately. The iterator does both in one formula

Add a new Measure into SalesOrderDetail and start typing in the following (Type it in so you bring up the intellisense)

AverageGrossSales = AVERAGE(SalesOrderDetail[OrderQty] * Uni

If you type in the above you notice intellisense stops working when you get to Unit price. This doesn’t work. The average function only accepts a single column.

This is where you could create this part as a calculated column but we want everything in one measure

AverageGrossSales with Discount =

AVERAGEX(SalesOrderDetail,

SalesOrderDetail[OrderQty] *SalesOrderDetail[UnitPrice]*

(1 – SalesOrderDetail[UnitPriceDiscount])

)

Also, you cant use AVERAGE because Average only accepts one value. AVERAGEX can be used in a measure because it iterates through each row and then aggregates

CONCATENATEX example

Add the following measure to Colours

Colours Incorrect = CONCATENATEX(Products, Products[Color],” “)

Its only called Incorrect because this is the wrong way to do it

This measure should show us what colours we have filtered on but as you can see below, now I have added a Card. We get the filtered colours for every row and we don’t want that. The Concatenate is iterating through every row and adding every colour together

Colours = CONCATENATEX(DISTINCT(Products[Color]),Products[Color],” “)

If you change the Table part of the expression to Distinct Colour, this gives you the distinct colour filtered on

This time we added a Distinct and if we choose more than one colour a space will be between each colour.

Evaluation Context

Defines what values an expression can see

Nesting Evaluations

If the above is a little confusing (and it will be) here it is again

RANKX is an iterator

Its iterating through products so we are going to start at the first product. Tulip

Its passing everything from Internet Sales related to Tulips into basically a temporary table

We are passing this information into the SUM of X and we repeat for every Product

And now the RANKX has information to work with

Analysing History with Time intelligence

We need to be able to compare periods. We can make use of DAX time intelligence

  • Single date functions – Return a single date First date
  • Date table functions – Returns a table as the output. Year to date. Takes in a date and takes in a column filtered year to date
  • Date aggregation functions Total Year to Date

Create a Date Table

Why can’t I use the date table in sales?  Dates need to be contiguous

How to 1:

  • Get Data > Blank Query
  • Next click on Advanced Editor

let

    Source = #date(2011,1,1),

    Dates = List.Dates(Source,3650,#duration(1,0,0,0))

in

    Dates

Create the source date which is the 1st Jan 2011. Then take the source and create Dates. I’m creating 3650 rows with a duration of 1 day between each

Then click on ‘To Table’ to convert to table. Then I renamed the column to Date,

I can then add more dates using M query by duplicating the date and changing to Year, Month and Day. Make sure Date is a Date

How to 2: You can also make this table using DAX

  • Modelling tab > New Table

DatesDaxCALENDAR = CALENDAR(“01/01/2011″,”01/01/2021”)

This is your starting point, but we can amend this to include other date columns

DatesDaxCALENDAR = ADDCOLUMNS(CALENDAR(“01/01/2011″,”01/01/2021″),”Year”,YEAR([Date]),”Month”,MONTH([Date]),”Day”,DAY([Date]))

If you used CALENDARAUTO it looks at the date that you used and it automatically calculates the start and end based on your data set

Connect your new date table up to the Sales table

For the Power BI exercise, I am going to use the Date table created with M.

Dates as Surrogate Keys

Sometimes when you work with data warehouses you have an integer as a date for the surrogate key. 20190206 for example

If you have to join to a surrogate key.

  • Modelling > Mark as Date table. Say which is the date and now this will work. The only reason to mark as date table is if you are joining to a surrogate key.

If you move everything to an Analysis Service, you MUST have a date table marked as date

Year over Year and Year to Date Analysis

First, Add Sales Header to your Power BI Report from the Database

  • Edit Queries > Recent Source > Choose the Adventureworks Source > Sales.SalesOrderHeader > Import
  • Rename to Sales Header and close and apply
  • Go into relationships and set as follows

Sales Header OrderDate to Date in DateM (Ensuring they are both dates)

Disconnect the other Date table. This is for reference only

Now create this year to date measure in the DateM table which uses date from the date table

YTD DatesM = TOTALYTD(SUM(‘Sales Header’[TotalDue]),DatesM[Date].Date)

As you can see this works. Year to date resets when we get to the next year.

Get the Sales from the Previous Year (Added measure to the Sales Header Table)

Prev Year = CALCULATE(SUM(‘Sales Header'[TotalDue]),PREVIOUSYEAR(DatesM[Date]))

YoY Growth = ([Prev Year] – Sum(‘Sales Header'[TotalDue])) / Sum(‘Sales Header'[TotalDue])

So take the current total Due from the Previous years Total Due. Then divide by the total Due

This is only a starting point really but gives you a good idea of some basic DAX queries, How to write it and why.

Creating an open Account Measure using DAX

Thank goodness for the amazing people on the Power BI Forums helping with this. I have the solution but what I wanted to do was to create a post to explain the why

The Problem

I have accounts and events occurring on each account. What I want to know is the time period they are open

Just for this example I have taken One account with 11 activities

If you run the visual against start date all you get is a measure againt the start time period.

We are more interested in creating a measure to show how long the activity went on for.

For this example I am going to choose Jan 28 2017 to work with to continue because the end date for this is March 21st 2017 so this activity should span 3 months

The Date Dimension

To do this you must have a date dimension. For the purposes of this example I’m going to create one in DAX (Create table) Using the following:

Notice that I have used Start and End Dates to set the min and Max years. You should already have a date dimension to work with though.

Set up the relationship between the Fact table and Date dim

Start date is the Active join and End Date is the Inactive Join

You could create a role playing dimension for the End date and have that as an active join but in this case I’m going to go for the above design.

Account Start flag (New Measure)

  • CALCULATE – You can modify your expression using a filter
  • ALLSELECTED – If you have a slicer set to Year (Date) , ALLSELECTED takes the ‘Context’ from the slicer Like 2018.
  • Date is less than or equal to the Max Date in the Filter Context

Account End Flag (New Measure)

  • ENDOFMONTH – Returns the last date of the month in the current context for the specified column of dates.
  • The date at the end of the month is less than the Max date in the Filter Context
  • USERELATIONSHIP Because End Date is the inactive relationship we need to specify that this relationship is used, Not Start Date

Add an Open Measure

To understand the results we need to try and understand how the new measures work

Jan

  • Start Flag =        1 because 18th Jan is less than Jan 31st
  • End Flag  =        NULL because March 21st (Reset to March 31st) is NOT Less than Jan 31st
  • Open =  1- null = 1

Feb

  • Start Flag =        1 because 18th Jan is less than Feb 28th
  • End Flag  =        NULL because March 21st (Reset to March 31st) is NOT Less than Feb 28th
  • Open =  1- null = 1

Mar

  • Start Flag =        1 because 18th Jan is less than Mar 31st
  • End Flag  =        NULL because March 21st (Reset to March 31st) is NOT Less than March 31st
  • Open =  1- null = 1

Apr

  • Start Flag =        1 because 18th Jan is less than April 30th
  • End Flag  =        1 because March 21st (Reset to March 31st) is Less than April 31st
  • Open =  1- 1 = 0

There are usually lots of ways to answer one question so this is one of many but the Open Flag now allows me to pull up all my activities and show in a Matrix where the are open.

Create a visual Axis Category from a measure (Create Table with DAX)

Take this data as an example

  • ID      Total
  • 99                   1
  • 99                   1
  • 101                  1
  • 101                  1
  • 101                  1
  • 333                  1
  • 333                  1
  • 333                  1
  • 333                  1
  • 635                  1
  • 635                  1
  • 635                  1
  • 635                  1

If you do the usual measure SUM(Total) you would get 

  • ID       Total
  • 99                   2
  • 101                  3
  • 333                  4
  • 635                  4

But I don’t want the visual to show the above. I want the number of occurences on the axis and then the number of IDs within that occurence as the value

You cant use a measure on the axis, they can only be used as values. How you you solve this problem?

Go to Modelling and New Table

SummaryTable = 
SUMMARIZE (
    'fact Fact',
    'fact Fact'[ID],
    "Frequency", COUNT ( 'fact Fact'[ID])) 

SUMMARIZE Returns a summary table for the requested totals over a set of groups. In this case, Our group is the IDs.

For the Summary we are simply counting our IDs as the Frequency

Next, Add a new Table

PotentialFreqs = GENERATESERIES ( MIN (SummaryTable[Frequency] ),max('SummaryTable'[Frequency]), 1 )

This creates a list of Frequencies starting at 1 and Ending at our Max Frequency. there is is issue here. we have a lot of NULL Values that creates a large Frequency number that we don’t want in our series of frequencies. this means we have values from 1 to 98 and there should only be values 1 to 4. How do we remove these values?

SummaryTable = 
SUMMARIZE (    
        'fact Fact',    
        'fact Fact'[ID],    
        "Frequency", IF(
                       'fact Fact'[ID] 
                       <>"0",COUNT 
                       ('fact Fact'[ID]))  
          )

Note the addition of the IF statement. If the ID is not zero then count the IDs. Else do nothing

Create relationships

In order to use the new tables we need to create relationships between them

the Summary table is joined to the fact table by the ID (Business key) within the fact table

PotentialFreqs is joined to the SummaryTable via Value to Frequency

the first visual to try this out is a stacked bar chart. Value from PotentialFreqs table is in the Axis and Frequency from the Summary Table is used as the Value.

So we can see that 33 of our IDs have 3 records in the table.

the PotentialFreqs table is really useful if you want to plot your data in a histogram as it gives you a continuous type to plot against rather than categorical

In conclusion, if you need to use a measure on an Axis rather than a value, Create a summary table and then join to the main tables in your model

Update

Unfortunatly the above solution doesnt quite work because I want to base my metrics against each year. sometimes the IDs are split into Years. I will be looking at this later.

Power BI – Dax – RANKX how to use Rank X with 2 values, Filters and Slicers in a Matrix

I have been struggling to implement RANKX into my Reports. I have 3 tables

The date table is used for lots of other reports and goes up to the end of 2019

Our facts for this particular report only go up to March

We want to rank the Average Fact against the Group by Month.  The report will have a slicer on Year

We also don’t want to rank against the full set of groups. Only a subset of them so I am applying a filter against group for three of the groups

The original Ranking DAX used

Rank = RANKX(CROSSJOIN(ALL(‘dim Date'[date].[Month]),ALL(‘dim'[Group])), [Avg Fact],,DESC)

RANKX – RANKX is an Iterator. It takes a table and an expression to evaluate. RANKX looks at each row in the table and running its evaluation which is to return the ranking for each row in the table argument. RANKX creates a row context because it’s an iterator

Row Context –  Calculated at processing time instead of at run time. This is a calculated column rather than a measure because the data is set on each row. A good example of a Row context calculation is, Is this value greater than 100.If yes Set to True. If no set to False. This is applied on each row

CROSSJOIN – Cross join allows you to recreate a table from all Tables and Columns in the cross join

ALL – Returns all the rows in the table, ignoring any applied filters

So essentially You are ranking against ALL months in the date and All groups within the other dimension table. We are ranking the Avg Fact measure

Now this works to some extent if you add in the table with the date filter set but no filter on the groups you want. Looking at January for example

However Later on in the data set the null values are set as

AND when I apply the filter of only having certain groupings in the table the RANKX fails even further

Note that the RANKX is still ranking every single value, even though we have applied a filter for the Group

This is because of the ALL. The Cross Join takes ALL month Values and ALL Group Values, and dismisses the fact that we have applied a filter

You need ALL because if you just ranked against the 1 value that the row context is on, all ranking would be 1.

Rank = RANKX(CROSSJOIN(ALLSELECTED(‘dim Date'[date].[Month]),ALLSELECTED(‘dim'[Group])),  [Avg Fact],,DESC)

ALLSELECTED is different to ALL because ALL calculates everything ignoring filters. ALLSelected Takes into account the filter on the visual. An important part of the solution is that we are slicing on Year and we are filtering for specific groups. As a consequence we need to use ALLSELECTED

Remember we are ranking the measure against each Group. What is the best Group this Month?

Next stop, ensuring we don’t rank against null values

Rank =

IF(

    NOT ISBLANK( ‘fact'[Avg Fact]),

    RANKX(CROSSJOIN(ALLSELECTED(‘dim Date'[date].[Month]),ALLSELECTED(‘dim'[Group])),

    [Avg Fact],,DESC)

)

We have wrapped up the original RANKX into An IF block

NOT ISBLANK (If our measure its not blank then assign the RANKX function

This is great, Its working in the table because we can see in February ‘Age’ is  top ranked and ‘Digi’ is bottom ranked.

Changing the DAX Query issues from table to matrix

However , if the data is to be displayed in a matrix we may not want to rank in this way at all

As you can see, in January we rank against each group. But what we could require is to Rank the best month for a group.

So we have created a rank to rank each group in a Month. NOT to rank the best month for a group.

As yet I haven’t come up with a solution to this one so the mystery continues but once I have a solution I will certainly create a post about it.

If you have a solution please let me know

Power BI Time based Measure not working when slicer (Year) applied

SAMEPERIODLASTYEAR DAX not working with a year filter

I wanted to create a KPI showing year to Date and Last year to date.

So I have Measures:

YTD Complaints = TOTALYTD(‘fact Fact'[Complaints],’dim Date'[date].[Date])

LY YTD Complaints = CALCULATE([YTD Complaints],SAMEPERIODLASTYEAR(‘dim Date'[date].[Date])) 

I have a slicer on year so i can select the year to look at. Without the year selected its fine (Ive put the data into a table simply to have a look at)

As you can see the last year information is shown in the column 

if I choose 2019 to look at i want to see 2019 with Last year against it

See how the Last year value disappears. I was expecting that creating this measure would allow you to see Last years metric even with a year slicer set

The answer to this is actually quite simple

Here I have used Year which is an actual data column in my date table to slice the data with and its not working

This time I have used year from my date hierarchy and its worked

Which to me meas that if you are using these kind of time based measures, if you want to slice by date you HAVE to use the time period from the date hierarchy in the slicer rather than another data item (As in the example. I used the Year column created in the date dimension)