I attended the Power BI conference from http://powerbiconference.com/ on the 16th June and got so many great ideas.
One was from the session on DAX and Power BI models. We have already spoken about steering clear of flat files. Power BI prefers the STAR schema model, comprising a fact table and dimensions that provide all the filters for the metrics.
I do, however, seem to use snowflake schemas, and it seems that even these aren’t recommended for Power BI.
So I wanted to have a look at this and see why I do it and what it actually means for Power BI. Also, how can I change my thinking from snowflake to STAR?
This is very similar to something I am working on at the moment. We are looking at events, and these events consist of agenda items. It’s very easy to create a snowflake schema for this. Let’s see how it would work in Power BI.
I have loaded the data into Power BI
Fact to dim is one to one. The relationships started as both direction. Moving the facts into a fact table and the descriptions into a dim table in a 1 to 1 relationship is called creating a Degenerate table.
I have changed it to 1 to many to get a single-direction filter, and the event table that snowflakes off from events also creates a 1 to many relationship.
If, however, you leave the model as one to one, both direction, it makes no difference to the results. Separating the fact and dim when the relationship is 1 to 1 is known as a
This is an interesting look at the results. There are two events.
When you bring in data from the snowflake dimension, it still shows the 2 event metric. However, the metric looks a little confusing if you don’t understand the data.
For example, you might expect the table with EventDetail and Event to sum to 12 (the 12 agenda items), but it doesn’t. It sums to the 2 events (which is correct for the number of events but not for the number of agenda items).
So the next question the user wants answered is: how many agenda items are there per event?
I created a Measure in the fact table that does a distinct count of the Event Detail IDs
# Events = DISTINCTCOUNT('Dim Event Details'[EventDetID])
Again, this works a treat. There are 12 Agenda Items
And again, the new measure shows us correctly the number of agenda items.
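To make the two counts concrete outside Power BI, here is a rough Python sketch of the same snowflake. The rows are made up for illustration: the split of seven and five agenda items per event is an assumption, as are the event names. Only the column name EventDetID comes from the measure above, and distinct_count is a hypothetical helper standing in for DISTINCTCOUNT.

```python
# Hypothetical snowflake: fact -> Dim Event -> Dim Event Details.
fact_events = [{"EventID": 1}, {"EventID": 2}]

dim_event = [
    {"EventID": 1, "Event": "Event A"},
    {"EventID": 2, "Event": "Event B"},
]

# Assumed split of the 12 agenda items: 7 for event 1, 5 for event 2.
dim_event_details = (
    [{"EventDetID": i, "EventID": 1} for i in range(1, 8)]
    + [{"EventDetID": i, "EventID": 2} for i in range(8, 13)]
)


def distinct_count(rows, column):
    """Rough stand-in for DAX DISTINCTCOUNT over a list of dict rows."""
    return len({row[column] for row in rows})


# Counting on the fact gives the number of events.
events_total = distinct_count(fact_events, "EventID")

# Counting on the snowflaked detail table gives the number of agenda items.
agenda_total = distinct_count(dim_event_details, "EventDetID")

# Per-event count: filter the detail rows by event first,
# roughly what the filter from the Event column does in a visual.
agenda_per_event = {
    event["Event"]: distinct_count(
        [d for d in dim_event_details if d["EventID"] == event["EventID"]],
        "EventDetID",
    )
    for event in dim_event
}

print(events_total, agenda_total, agenda_per_event)
```

The event count and the agenda item count are both correct; they just answer different questions, which is why the first visual looked confusing.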
The snowflake seems to be doing everything that it needs to do. At this point it still feels like snowflake is a good option. But how would you change this into a STAR schema?
Fact table 1 to many
Let’s have a think about how 1 to many happens, using the date dimension as a good example.
Notice that the dim is on the one side and the many is on the fact side.
This is what we are looking for. The dimension is a short, fat table (one row per date)
And the fact table is long and thin (the date ID is in the fact table many times)
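One quick way to check which table sits on the "one" side is to test whether the key is unique in each table. This is a minimal Python sketch with made-up rows; the table names, column names and the is_unique helper are all hypothetical.

```python
from collections import Counter

# Hypothetical date dimension: one row per date (short and fat).
dim_date = [
    {"DateID": 1, "Date": "2021-01-01", "Month": "January", "Year": 2021},
    {"DateID": 2, "Date": "2021-01-02", "Month": "January", "Year": 2021},
]

# Hypothetical fact table: the same DateID appears many times (long and thin).
fact = [
    {"DateID": 1, "Amount": 10},
    {"DateID": 1, "Amount": 25},
    {"DateID": 2, "Amount": 5},
    {"DateID": 1, "Amount": 40},
]


def is_unique(rows, key):
    """True when every value of `key` appears exactly once in `rows`."""
    counts = Counter(row[key] for row in rows)
    return all(n == 1 for n in counts.values())


dim_is_one_side = is_unique(dim_date, "DateID")
fact_is_one_side = is_unique(fact, "DateID")

print(dim_is_one_side, fact_is_one_side)
```

If the key is unique in the dimension and repeated in the fact, the relationship is 1 to many with the dimension on the one side.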
Denormalising the Event and Agenda tables looks like this
The Fact is connected to the Event and Schedule by the Event ID
I created a new measure on the new Event and Schedule Dimension
# Events (STAR) = DISTINCTCOUNT('Event And Schedule'[EventDetID])
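The denormalising step can be sketched in Python as a simple join that copies the event attributes onto every agenda row. This is only an illustration under assumptions: the rows, the AgendaItem column and the event names are made up, and in practice the flattening would more likely be done in the source query or in Power Query.

```python
# Hypothetical normalised (snowflake) tables; names and rows are made up.
dim_event = {
    1: {"Event": "Event A"},
    2: {"Event": "Event B"},
}

dim_event_details = [
    {"EventDetID": 1, "EventID": 1, "AgendaItem": "Welcome"},
    {"EventDetID": 2, "EventID": 1, "AgendaItem": "Keynote"},
    {"EventDetID": 3, "EventID": 2, "AgendaItem": "Workshop"},
]

# Denormalise: copy the event attributes onto every agenda row,
# giving one flat 'Event And Schedule' style dimension.
event_and_schedule = [
    {**detail, **dim_event[detail["EventID"]]}
    for detail in dim_event_details
]

# Distinct count of agenda items, before and after flattening.
count_snowflake = len({d["EventDetID"] for d in dim_event_details})
count_star = len({r["EventDetID"] for r in event_and_schedule})

print(count_snowflake, count_star)
```

The flat table has the same number of rows as the agenda table, so the distinct count of EventDetID does not change; the event description is simply repeated on each row.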
And I now have the same visuals against the Star and the Snowflake
So let’s see how they match up using Performance Analyser (you may have to turn it on in Options and Settings – Options – Diagnostics)
I have created a blank page to start from
From the blank page start recording
Great, you can immediately see that the STAR was quicker. We have a tiny amount of data but if you have a lot of data this will really make a difference
So we created a Star schema and sped our reports up. Although there are no issues with the snowflake, this is a clear win for the star.