Power BI. Using flags in a junk dimension

We currently have a project where the metrics are actually flags to count whether a record is true or false rather than business metrics like Amount, SaleAmount etc

  • Is something completed? 1 or 0
  • Is something in Development? 1 or 0
  • Is something Out of Scope ? 1 or 0

Etc.

Now, if you left these in the fact table you could sum them to create a measure. But you are left with thousands of rows of just 1s and 0s. against all the keys in the fact table.

Also, they aren’t really metrics. They are true and false flags and as a consequence should not be in a fact table.

The above solutions is not what Kimball would recommend. Kimball recommends a Junk Dimension

Definition from Kimball: — A junk dimension is the combination of several row-level cardinality flags and attributes into a single dimension table rather than modeling them as a separate dimension

It should also be noted that a junk dimension can contain flags and other attributes that don’t quite fit anywhere else.

The star schema looks like this (This is just an example STAR)

Flags are split between two areas D1 and D2 which has also been added to the junk dimension as its basically just a title to split the metrics into two rows in a table.

These two areas are two completely separate low level fact tables, with different metrics and dimensions. think of for example ‘Human resources’ and Marketing’.

we have them here because we have a high level granularity STAR Schema comparing some of the similar flags across the two departments.

This could have been added as a separate dimension but as its just one data item it seemed more sensible to add it here.

So instead of having them repeated over and over again in Fact. We just have every option available set in the junk dimension once. So every single record where F1 is true for Area D1 goes to Key 0.

Summing the flags

The flags are Bool. True false columns so we cant simply sum them up. If we were to do that we would need to convert them to int

Flag int = CONVERT([Flag],INTEGER)

Once converted we could sum them.

SUM([Flag Int]

The junk dimension is used against multiple fact tables and here is the first issue.

If we create DAX this way, there is no mention of the central fact table. Therefore if you use it it just counts all the true flags in the dimension, completely separately from the fact table.

We need that join to the fact table in the DAX

So we dont need to Convert the Flag to an int to use it in a measure. We can do the following:

Measure is true = CALCULATE(COUNT(‘Fact1′[JunkKey]),’Dim Junk'[Flag]=TRUE()|| ‘Dim Junk'[Area] = “D1” )

CALCULATE

The CALCULATE function evaluates the Count of the Junk keys that are already integer within the fact table. This is important because we have now created a reference to one of the fact tables.

And we are counting where the Flag is equal to True in the Dim

||

Allows us to add another and. In this case we are only looking at the D1 area.

We can add these logically into the separate fact tables

Measure is true = CALCULATE(COUNT(‘Fact2′[JunkKey]),’Dim Junk'[Flag]=TRUE()|| ‘Dim Junk'[Area] = “D1” )

And now we have a fact table with measures we can use for reporting. And we have made sure out Power BI data model is as small as it can be in memory.

Power BI Concatenating two measures together under one column header for a table display

Another requirement came through for a table that showed the current month figures against last months figures,

However, to save space the two values were in the same column, last month contained in a ()

So part one was to create the date time intelligence for last month,

And then to attempt to concatenate the two figures just for the table

Time intelligence to see this time last month

There are various ways of doing this

Prev Month At Risk = CALCULATE([At Risk],PARALLELPERIOD('Dim'[Date], -1,MONTH))

Here we use CALCULATE to evaluate the SUM of At Risk (created in a base measure already) in a modified filter context. Which is PARALLELPERIOD using the date from the Date Dim.

PARALLELPERIOD takes the current set of dates (For us its month and year) and shifts the first and last date in the period specified a number of intervals. -1 takes us back a month.

This is the chosen method but you can also use

CALCULATE([At Risk],DATEADD(‘DimDate'[Date],-1,MONTH))
DATEADD returns a table that contains dates shifted forwards or backwards in time the number of intervals from the date in the current context. Again we are sing Month and Year in the current Context.

Returns a table that contains a column of dates, shifted either forward or backward in time by the specified number of intervals from the dates in the current context

PREVIOUSMONTH('Dim Report Snapshot Date'[Date]))

PREVIOUSMONTH returns all dates from the previous month using the first date in your context. As we are using month, our first date is 0103 so it goes back to 0102

CALCULATE([Risk],SAMEPERIODLASTYEAR('DimDate'[Date])))

Returns a table (Its a table expression) Returns a single column of date time values from the same period in your current context from last year, SAMEPERIODLAST Year can only go back a year.

Concatenate current month and Last Month into one single value

This is for a table only. Not to be used for Proper analytics.

So first we need to convert the number into a string.

And then we can concatenate. The following was used

At Risk =
VAR varMeasure = CONVERT([At Risk],STRING)

VAR varPrev = CONVERT([At Risk],STRING)

VAR varConcata = CONCATENATE(varMeasure,”(“)

VAR varConcatb = CONCATENATE(varPrev,”)”)

RETURN

CONCATENATE(varConcata,varConcatb)

There will be lots of other ways to do this but I decided on the following
Creating Variables to hold:

  • The measure converted to a string
  • The previous (Month) measure converted to a string
  • CONCATENATE (You can only concatenate two things using this method) so The measure was concatenated with (
  • CONCATENATE, Doing the second part of the concatenation. The previous Month String value with )
  • Finally we return a Concatenation of a and B so make a string value we can use for the table.

To CONCATENATE more that two columns you can use

Column D = [column A] & " - " & [column B] & " - " & [column B]

So were we have 5 measures and a total. To view them in the table under one column header we have concatenated them together.

Power BI DAX – CALCULATE AND FILTER and Implicit and Explicit filtering

As ever with DAX, I tend to have to remind myself of the basic every so often, especially when i have bee concentrating on other topics.

We need to remember the following

Model Relationships propagate filters to other tables.

Product can be sold many times. (1 to many)

Have a look at the model in your Power BI desktop file


If you click on the join

You can see Cross Filter Direction (In this case Single) We can Filter the measure within the fact table by, for example, Product Name. But we cant Filter Product Name by, for example Sales amount within the fact table

When you are looking at Measures you basically filter a measure by either an implicit filter or an explicit filter within the DAX.

Confusing? How can the above information not be?

Lets take this a step backwards by Looking at implicit filtering


using the above example we have taken Color from the Product table and Order Quantity (The Metric) from the Fact table

We implicitly Filter Order Quantity by colour. No need to specify anything in DAX

Implicit – Suggested though not directly expressed

CALCULATE

Lets create a measure

Order Quanity of Yellow Products = CALCULATE(SUM(FactInternetSales[OrderQuantity]),DimProduct[Color]=”Yellow”)

So here CALCULATE evaluates the summed value of Order Quantity, against the explicit filter of color = Yellow. So in the above visual, it ignores the implicit value of the Color filter within the visual.

Here is a nice reminder


If you drag a field into your visual its an implicit filter

If you set the filter within your DAX its an explicit filter and it will override what is in your visual

CALCULATE allows you to calculate a value against a context (The filter modification).

Lets change this slightly

Order Quantity of Yellow Products = CALCULATE(SUM(FactInternetSales[OrderQuantity]),ALL(DimProduct))

Now instead of Colour = “Yellow” We are saying Calculate against all Products

This time, note the total matches the total. this is because we are ignoring the colour context and Getting the total of ALL products

* quick sanity check. The visual isn’t quite right. The totals should be the same on the rows and the total column. This must simply be an issue with the visual I used here.

FILTER

Filter basically returns a table that has been filtered. Lets have a look at a FILTER Function used in very much the Same way as CALCULATE above

TOTAL Order Quantity Yellow colour = CALCULATE(SUM(FactResellerSalesXL_CCI[Order Quantity]),FILTER(DimProduct,DimProduct[Color]="Yellow"))

We are again calculating the sum of Order Quantity, with a Filter of Color = yellow. Lets look at how the visual changes

This time, instead of seeing the total for yellow against every other colour quantity, we only as see the measure against yellow.

the great thing about FILTER is that you can have more than one Filter as an OR

TOTAL Order Quanity Yellow & Black = CALCULATE(SUM(FactResellerSalesXL_CCI[Order Quantity]),FILTER(DimProduct,DimProduct[Color]="Yellow"|| DimProduct[Color]="Black"))

Now we can see Yellow OR Black Quantities

how about if we want to see only yellow products in 2014. FILTER comes in useful for this

TOTAL Order Quanity Yellow & Black = CALCULATE(SUM(FactResellerSalesXL_CCI[Order Quantity]),FILTER(DimProduct,DimProduct[Color]="Yellow"),FILTER(DimDate,DimDate[CalendarYear] = 2014))

This time, Not only are we only seeing Yellow product Quantity but the Quantity sold in 2014

FILTER is obviously slower than CALCULATE so if you are only filtering on one thing, go for CALCULATE.

Design a site like this with WordPress.com
Get started