Setting up an Azure Data Lake V2 to use with Power BI dataflows in the Service (as a data source)

Before the brand new Azure Data Lake arrived, I was adding all my files to Blob Storage. Azure Data Lake V2, however, is built on Blob Storage and Data Lake V1. It's built for big data, and a fundamental change is that we now have a hierarchical namespace, which organises your files into directories.

So now we can do things like use all the files from a specific directory, or delete all the files in a specific directory. We can categorise our files within the data lake.
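As a rough illustration of what directory-level operations mean, here is a small sketch in plain Python. It does not talk to Azure at all; the file paths and both helper functions are made up purely to show the idea of selecting or removing everything under one directory.

```python
from pathlib import PurePosixPath

# Made-up file paths standing in for files in a Gen2 file system.
paths = [
    "Workshop1Files/sales.csv",
    "Workshop1Files/customers.csv",
    "Archive/2018/sales.csv",
]


def files_in_directory(all_paths, directory):
    """Return the paths that sit directly inside `directory`."""
    target = PurePosixPath(directory)
    return [p for p in all_paths if PurePosixPath(p).parent == target]


def delete_directory(all_paths, directory):
    """Return the paths left after removing `directory` and everything under it."""
    target = PurePosixPath(directory)
    return [p for p in all_paths if target not in PurePosixPath(p).parents]


workshop_files = files_in_directory(paths, "Workshop1Files")
remaining = delete_directory(paths, "Archive")
```

With a flat namespace you would have to imitate this by matching name prefixes yourself. With a hierarchical namespace the directory is a real object that the storage service understands.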

Set up Azure Data Lake V2 in Azure Portal

When you currently go into Azure and look for Data Lake V2, you can only find Gen 1.

So the question is, how do you set up Gen 2 in the Azure Portal? (At the time of writing it is the 25th November 2019. Hopefully this will get easier in the future.)

First of all, I go to the subscription where I want to add the new Data Lake V2.

Open up the Portal menu (now hidden at the left of the screen)

Choose Create a resource

Next, choose Storage and then Storage Account

Note that the Account kind is StorageV2 (general purpose v2)

I've set the Location to North Europe, simply because I know that's where our Power BI data in the Service is stored, and I may as well stick with this.

For the time being, I am leaving everything else as standard

Next, go to Advanced

The most important setting here is Data Lake Storage Gen2. Enable the hierarchical namespace and your storage account will now be created as Data Lake Storage V2.

Click Review and Create

Create a File System within a Container

We now have a few options available to us. I have some files to add, so I am going to add them to a container.

Click on Containers and then + File System

Click OK

When you click on your new storage account(s), you are told to download Azure Storage Explorer.

I already have Azure Storage Explorer downloaded. If you don't have it, it's something you will absolutely need to work with Azure Storage accounts.

Once it has downloaded, open Azure Storage Explorer

You will need to add your Azure Storage accounts by clicking the little connector icon

You will be asked to sign in to your account with your Office 365 credentials and two-factor authentication

This will log you into all your Subscriptions and Services

You are good to go

Here you find your subscription, then go to Data Lake Storage Gen2 and find the new file system.

I have added a folder here called Workshop1Files to my File System

Obviously, Data Lake Storage gives you many ways of working with files and of automating how files arrive in the storage area. In this case, I am simply going to move a file into my new folder to work with.

Double-click on the folder, then click Upload and Upload Files

And now your file is in the cloud, in an Azure Data Lake ready to use.

Connect to your Azure File with Power BI Desktop

The first test is whether we can access this data within Power BI Desktop.

Open Power BI Desktop and Get Data

Choose Azure Data Lake Storage Gen2 (currently in beta)

Add the URL

Data Lake Storage Gen2 URLs have the following pattern: https://<accountname>.dfs.core.windows.net/<filesystemname>/<subfolder>

If you right-click on the file in Storage Explorer and go to Properties, the URL shown there has a different structure:

http://<accountname>.blob.core.windows.net/<filesystemname>/<subfolder>
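The difference between the two forms is only the scheme and the endpoint name in the host. A minimal Python sketch of the swap is below. The function name `blob_url_to_dfs` and the example account and file names are my own inventions for illustration, and forcing https is an assumption made to match the Gen2 pattern shown earlier.

```python
from urllib.parse import urlsplit, urlunsplit

BLOB_SUFFIX = ".blob.core.windows.net"
DFS_SUFFIX = ".dfs.core.windows.net"


def blob_url_to_dfs(url):
    """Rewrite a Blob endpoint URL into the matching Gen2 (dfs) URL."""
    parts = urlsplit(url)
    host = parts.netloc
    if not host.endswith(BLOB_SUFFIX):
        raise ValueError("not a blob.core.windows.net URL: " + url)
    # Everything before the suffix is the storage account name.
    account = host[: -len(BLOB_SUFFIX)]
    # Assumption: always use https, as in the Gen2 URL pattern.
    return urlunsplit(
        ("https", account + DFS_SUFFIX, parts.path, parts.query, parts.fragment)
    )


# Hypothetical account, file system and file names.
dfs_url = blob_url_to_dfs(
    "http://myaccount.blob.core.windows.net/myfilesystem/Workshop1Files/data.csv"
)
```

The path after the host (file system, folder, file) is left untouched, so the URL copied from Storage Explorer only needs its host changed before it goes into Power BI.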

If you try to connect with the original blob URL from Storage Explorer, you get an error.

And if you change the URL from blob to dfs, there is still a missing part to the puzzle. Go back to the Azure Data Lake storage account in Azure and add the Storage Blob Data Reader role to your account.

Then try again and hopefully you are in.

No need to combine because we have specified the file.

There are different ways you can load files. I loaded one file, but you can load all the files in the file system:

https://storageaccount.dfs.core.windows.net/filesystemname

or all the files under a directory in the file system (you can include subdirectories in this):

https://storageaccount.dfs.core.windows.net/filesystemname/directoryname/directoryname
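The three levels (whole file system, a directory, a single file) all follow the same pattern, so the URL can be assembled from its parts. Here is a small Python sketch of that. The helper name `adls_gen2_url` is hypothetical, and `data.csv` is a made-up file name; the account, file system and directory names are the placeholders used in the URLs above.

```python
def adls_gen2_url(account, filesystem, *path_parts):
    """Build a Gen2 (dfs) URL for a file system, a directory or a file."""
    base = f"https://{account}.dfs.core.windows.net/{filesystem}"
    # Drop empty parts and stray slashes so the pieces join cleanly.
    cleaned = [part.strip("/") for part in path_parts if part.strip("/")]
    return "/".join([base] + cleaned)


# Every file in the file system
whole_filesystem = adls_gen2_url("storageaccount", "filesystemname")

# Every file under a directory (and its subdirectories)
one_directory = adls_gen2_url(
    "storageaccount", "filesystemname", "directoryname", "directoryname"
)

# One specific file (hypothetical file name)
one_file = adls_gen2_url(
    "storageaccount", "filesystemname", "directoryname", "data.csv"
)
```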

Connect to your Azure File with Power BI Data Flow

I am creating dataflows in the Power BI Service to ensure they can be reused across the company. The question is, can I connect to the above file in the Service via a dataflow?

In the Power BI Service, add a dataflow, which takes you into the Power BI Query Editor in the Service. I already had some dataflows connected to an Azure database.

The data is in Azure Data Lake Storage, so the first thing I do is try the Azure route.

However, there is no Azure Data Lake Storage Gen2 option. This must be something coming in the future. So then I go to File and click on Get Data Text/CSV.

You will need to add the file path and your credentials (as per the previous advice, use dfs, not blob, in the URL). This seems a little flaky at the moment. I choose Organisational Account first, before adding the URL, and then it seems to work.

Remember, you can go back to Azure Storage Explorer and click on Properties to grab the URL from there.

We don't need to set up a gateway because everything is now in the cloud.

Clicking Next, nothing happens. It just keeps bouncing back to the same window.

Attempting to use the Blob Storage connector (using the Azure account key as authentication) also doesn't work, either with blob in the URL or with dfs in the URL.

It would appear that I have currently hit a brick wall and there is no DLGen2 connector for dataflows.

I will be keeping an eye on this because, obviously, when you are pushing the new generation of data lakes and dataflows, there needs to be a DLGen2 connector for dataflows.

Update

I had a reply back on the Power BI forum (not a good one):

The feature haven’t been planed. If there is any new message, the document: What’s new and planned for Common Data Model and data integration  will be updated.

I have found this in Ideas

https://ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/38930032-add-azure-data-lake-storage-gen2-as-a-data-sourc

Please help us get this working by voting for this idea.
