We currently have a project where the metrics are actually flags to count whether a record is true or false rather than business metrics like Amount, SaleAmount etc
- Is something completed? 1 or 0
- Is something in Development? 1 or 0
- Is something Out of Scope ? 1 or 0
Now, if you left these in the fact table you could sum them to create a measure. But you are left with thousands of rows of just 1s and 0s. against all the keys in the fact table.
Also, they aren’t really metrics. They are true and false flags and as a consequence should not be in a fact table.
The above solutions is not what Kimball would recommend. Kimball recommends a Junk Dimension
Definition from Kimball: — A junk dimension is the combination of several row-level cardinality flags and attributes into a single dimension table rather than modeling them as a separate dimension
It should also be noted that a junk dimension can contain flags and other attributes that don’t quite fit anywhere else.
The star schema looks like this (This is just an example STAR)
Flags are split between two areas D1 and D2 which has also been added to the junk dimension as its basically just a title to split the metrics into two rows in a table.
These two areas are two completely separate low level fact tables, with different metrics and dimensions. think of for example ‘Human resources’ and Marketing’.
we have them here because we have a high level granularity STAR Schema comparing some of the similar flags across the two departments.
This could have been added as a separate dimension but as its just one data item it seemed more sensible to add it here.
So instead of having them repeated over and over again in Fact. We just have every option available set in the junk dimension once. So every single record where F1 is true for Area D1 goes to Key 0.
Summing the flags
The flags are Bool. True false columns so we cant simply sum them up. If we were to do that we would need to convert them to int
Flag int = CONVERT([Flag],INTEGER)
Once converted we could sum them.
The junk dimension is used against multiple fact tables and here is the first issue.
If we create DAX this way, there is no mention of the central fact table. Therefore if you use it it just counts all the true flags in the dimension, completely separately from the fact table.
We need that join to the fact table in the DAX
So we dont need to Convert the Flag to an int to use it in a measure. We can do the following:
Measure is true = CALCULATE(COUNT(‘Fact1′[JunkKey]),’Dim Junk'[Flag]=TRUE()|| ‘Dim Junk'[Area] = “D1” )
The CALCULATE function evaluates the Count of the Junk keys that are already integer within the fact table. This is important because we have now created a reference to one of the fact tables.
And we are counting where the Flag is equal to True in the Dim
Allows us to add another and. In this case we are only looking at the D1 area.
We can add these logically into the separate fact tables
Measure is true = CALCULATE(COUNT(‘Fact2′[JunkKey]),’Dim Junk'[Flag]=TRUE()|| ‘Dim Junk'[Area] = “D1” )
And now we have a fact table with measures we can use for reporting. And we have made sure out Power BI data model is as small as it can be in memory.
One Reply to “Power BI. Using flags in a junk dimension”
LikeLiked by 1 person