June 17, 2020 – Debbies Microsoft Power BI, SQL Fabric and Azure Blog

Dataflows allow you to separate your data transformations from your data sets. we can now centralise the dataflow and connect to the flow in power BI Desktop instead of having to connect to the data.

to create a dataflow, instead of getting data within the Power BI desktop pbix file, you create a dataflow in Power BI Service

This this takes you to Power Query Editor within the Service, where you can do all your transformations.

Dataflows are held in the gen2 data lake. See Use Data Lake Storage V2 as Dataflow Storage which gives more information on setting up your own data lake as storage.

This is a great feature and goes along way to providing us with our Modern Enterprise Data Experience.

there is only ever one owner of a dataflow, which means that if you are not the owner you cannot work on the dataflow. Thank fully there is now a Take Over Option.

If you are not the owner and you are a Workspace admins or member you can take over ownership to work on the flow by going to

Settings(In the top menu bar)> Settings

And then to dataflows. If you arent the owner you will see the takeover button that allows you to work on the dataflow.

This is great but we did have some questions

If the Initial owner added the Credentials to the Data Source to create the dataflow, does every Owner need to also add in the credentials?

From our Testing, when you take over the Data flow, you need to reset the Gateway, but you dont have to worry about the credentials.

This happens every time you take over the data flow

Dataflows connected to on premises data sources. How do we use the Gatway?

You cant use the Personal model of Gateway with a dataflow. You must use the Enterprise Gateway
You must have a Power BI Pro License to use the Gateway
The connections use the authentication and user information input into the data source configuration. When using takeover, these need to be re established.

There are three developers working on a solution. Two have been at the company for a while and have no issues accessing the dataflow. However, when I take over the data flow and get to the Power Query Editor in Service I hit the following error

Clearly I, as yet don’t have access to the Enterprise Gateway

Gateway checklist

Installing the Gateway (You can download the Gateway from Power BI Service)

In this Example, the Gateway has been installed.

Step 1 is to Install it on a server.

2. Sign into your Power BI Account to configure the gateway

Once installed it needs to be configured. In Power BI Service, Settings, Manage Gateways
Decide upon a name and a recovery key (And make sure you don’t lose the key)

3 Add Data Sources to the Gateway

Server Name
Database Name
User Name
Password
Security or Authentication methods
Advanced setting specific to just that data source type
Again, because Laura can access the data source, this has all clearly been set up. Server ds2chl261 Database AppDB

4. Tell the Gateway who can access the data source.

Click the Users Table and add the users of the data source.

5. Amending the Gateway

Again you go into Settings and Manage Gateways
Choose your the Data source ( Server and Database ) and go to the users Tab to add Users
In the above example, the gateway is being amended by adding a new user.

Once in place, the user will simply be able to select the Gateway as above.

Preview of Promoted Dataflows

You can now as of June 2020, set your dataflow to promoted to ensure everyone knows that this a dataflow that has been created and managed centrally, possibly by your BI team. It will eventually give you your ‘one view of the truth’

Go to Settings for your dataflow

And Promote the dataflow in the same was as you can with a dataset.

Only do this if its established that this is now centrally managed rather than self service data mashup

Note that certified isn’t yet enabled because this is done by your Power BI Admin

I attended the Power BI conference from http://powerbiconference.com/ and got so many great ideas on the 16th June

One was from the session related to DAX and Power BI Models. We have already spoken about Steering clear from flat files. Power BI prefers the STAR Schema model comprising of a fact table and Dimensions providing all the filters for the metrics

I however, do seem to use snowflake schemas and it seems that even these aren’t recommended for Power BI.

So i wanted to have a look at this and see why I do it and what it actually means for Power BI. Also, how can I change my thinking on Snowflake to STAR

Example 1

This is very similar to something I am working on at the moment. We are looking at Events and these events consist of Agenda items. Its very easy to create a snowflake schema for this. Lets see how it would work in Power BI

I have loaded the data into Power BI

Fact to Dim is one to one. the relationships started as both direction. Moving the facts into a Fact table and the descriptions into a dim table in a 1 to 1 relationship is called creating a Degenerate table

https://docs.microsoft.com/en-us/power-bi/guidance/relationships-one-to-one

I have changed it to 1 to many to get the single direction filter and the fact that the event table snowflaked off from events creates 1 to many.

If you however leave the model as one to one, both direction, it makes no difference to the results. Seperating the fact and dim when the relationship is a 1 to 1 is knows as a

this is an interesting look at the results. there are two events.

When you bring in data from the Snowflake dimension, it still shows the 2 event metric. However the metric looks a little confusing if dont understand the data

For example, the table with EventDetail and Event should sum to 12 (the 12 Agenda items) but it doesn’t. It Sums to the 2 event (Which is correct for the number of events but is not correct for the number of Agenda items)

So the next question the user wants is, how many agenda items are there per event?

I created a Measure in the fact table that does a distinct count of the Event Detail IDs

# Events = DISTINCTCOUNT('Dim Event Details'[EventDetID])

Again, this works a treat. There are 12 Agenda Items

And again, the new measure shows us correctly the number of agenda items.

the snowflake seems to be doing everything that it needs to do. At this point it still feels like Snowflake is still a good option. But, how would you change this into a snowflake Schema?

Fact table 1 to many

lets have a think about how 1 to many happens using the date dimension as a good example

Notice the the dim is on the one side and the Many is on the fact side.

This is what we are looking for. the dimension holds a short fat table (One date)

And the fact table is long and thin (The date ID is in the fact table many times)

Denormalise the Event and Agenda tables looks like this

The Fact is connected to the Event and Schedule by the Event ID

I created a new measure on the new Event and Schedule Dimension

# Events (STAR) = DISTINCTCOUNT('Event And Schedule'[EventDetID])

And I now have the same visuals against the Star and the Snowflake

So lets see how they match up usingPerformance Analyser (You may have to turn it on in Options and Settings – Options – Diagnostics)

I have created a blank page to start from

From the blank page start recording

Great, you can immediately see that the STAR was quicker. We have a tiny amount of data but if you have a lot of data this will really make a difference

So we created a Star schema and sped our reports up. Although there are no issues with the snowflake, this is a clear win for the star.

M	T	W	T	F	S	S
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30

Day: June 17, 2020

Power BI dataflows – Takeover