Setting up a Service Principal for Azure Data Lake Gen 2 (Storage) to use with Data Factory

An Azure service principal is a security identity used by user-created apps, services, and automation tools to access specific Azure resources. Think of it as a ‘user identity’ (login and password or certificate) with a specific role and tightly controlled permissions to access your resources.

Azure Service Principal

I am constantly having to remind myself how to set up the Service Principal for access to things like Azure Data Lake Gen 2 when I am setting up a Data Factory (or using the storage with another web app).

So I wanted to write a blog post specifically on this.

As the example, imagine you are moving data from an Azure SQL Database to files in Azure Data Lake Gen 2 using Azure Data Factory.

You attempt to add a Data Lake connection, but you need a Service Principal account to get everything authorised.

You need this so the Data Factory will be authorised to read and add data into your data lake.

An application (e.g. Data Factory) must be able to participate in a flow that requires authentication, so it needs to establish secure credentials. The default method for this is a client ID and a secret key.

There are two types of permissions:

Application Permissions: no user context is required. The app (e.g. Data Factory) needs to access the web API by itself.

Delegated Permissions: the client application (e.g. Data Factory) needs to access the web API as a signed-in user.
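As an added example (not code from the original post), here is the client ID plus secret flow above in plain Python: the token request an application permission uses (OAuth 2.0 client credentials against the Azure AD v2.0 token endpoint, with the scope `https://storage.azure.com/.default` that covers Azure Storage and Data Lake Gen 2), followed by a signature-less decode of the returned token so you can confirm its app and tenant claims match the IDs you later type into Data Factory. The tenant ID, client ID, secret and the offline fake token endpoint are constructed placeholders.

```python
import base64
import json
import urllib.parse
import urllib.request

# Constructed placeholders, not real identifiers. In practice these are the
# Directory (tenant) ID and Application (client) ID from the App Overview page.
TENANT_ID = "00000000-0000-0000-0000-000000000000"
CLIENT_ID = "11111111-1111-1111-1111-111111111111"
# Resource scope for Azure Storage, which includes Data Lake Gen 2.
STORAGE_SCOPE = "https://storage.azure.com/.default"


def token_endpoint(tenant_id):
    """Azure AD v2.0 token endpoint for the given tenant."""
    return "https://login.microsoftonline.com/%s/oauth2/v2.0/token" % tenant_id


def build_token_request(tenant_id, client_id, client_secret, scope=STORAGE_SCOPE):
    """Client-credentials grant: no signed-in user, the app authenticates as itself
    with its client ID and secret (the 'application permission' style of access).
    Returns (url, form-encoded body)."""
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    }
    return token_endpoint(tenant_id), urllib.parse.urlencode(form)


def request_token(tenant_id, client_id, client_secret, opener=urllib.request.urlopen):
    """POST the request and return the parsed JSON token response.
    `opener` is injectable so the flow can be exercised without network access."""
    url, body = build_token_request(tenant_id, client_id, client_secret)
    req = urllib.request.Request(
        url, data=body.encode("utf-8"), method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"})
    with opener(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def _b64url(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(segment):
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def decode_jwt_claims(access_token):
    """Return the payload claims of a compact JWT WITHOUT verifying the signature.
    This only inspects which app and tenant the token was issued for; the resource
    (Azure Storage) does the real verification, so never authorise on this alone."""
    parts = access_token.split(".")
    if len(parts) != 3:
        raise ValueError("not a compact JWT")
    return json.loads(_b64url_decode(parts[1]))


def identity_problems(access_token, expected_client_id, expected_tenant_id):
    """List mismatches between the token's app/tenant claims and the IDs entered in
    Data Factory. The app id is the `appid` claim in v1 tokens and `azp` in v2 tokens.
    An empty list means the secret belongs to the app and tenant you think it does."""
    claims = decode_jwt_claims(access_token)
    app_id = claims.get("appid", claims.get("azp"))
    problems = []
    if app_id != expected_client_id:
        problems.append("token app id %r != client ID %r" % (app_id, expected_client_id))
    if claims.get("tid") != expected_tenant_id:
        problems.append("token tid %r != tenant ID %r" % (claims.get("tid"), expected_tenant_id))
    return problems


def make_unsigned_token(claims):
    """Constructed, unsigned JWT (alg 'none') for offline demonstration only."""
    header = _b64url(json.dumps({"alg": "none", "typ": "JWT"}).encode())
    return "%s.%s." % (header, _b64url(json.dumps(claims).encode()))


class _FakeResponse:
    """Stand-in for the HTTP response object urlopen returns (constructed)."""
    def __init__(self, payload):
        self._payload = json.dumps(payload).encode()
    def __enter__(self):
        return self
    def __exit__(self, *exc):
        return False
    def read(self):
        return self._payload


def offline_opener(req):
    """Fake token endpoint: echoes the submitted client_id back as the appid claim."""
    form = urllib.parse.parse_qs(req.data.decode("utf-8"))
    token = make_unsigned_token({"appid": form["client_id"][0], "tid": TENANT_ID})
    return _FakeResponse({"token_type": "Bearer", "expires_in": 3599, "access_token": token})


if __name__ == "__main__":
    # Swap offline_opener for urllib.request.urlopen and a real secret to test a live app.
    response = request_token(TENANT_ID, CLIENT_ID, "constructed-secret", opener=offline_opener)
    print(identity_problems(response["access_token"], CLIENT_ID, TENANT_ID))
```

The secret is sent once to the token endpoint and never to the storage account; what reaches Data Lake is the short-lived bearer token, which is why the client secret belongs in Key Vault rather than in any pipeline definition.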

Create an App

In Azure, choose App Registrations.

Here you can create an app: New Registration.

Provide a name for your app, e.g. DataFactoryDataLakeApp.

Grant your Registered App permissions to Azure Storage

This will enable your app to authorise requests to the storage account with Azure Active Directory (AD).

You can get to your app by going to Azure Active Directory

Then App Registrations and choose the App

In your new App, go to Overview and View API Permissions

Next go to Add a permission

Go to the Azure Storage API, which contains Data Lake Gen 2.

Notice that we are setting up Delegated Permissions for Azure Storage

You are warned that Permissions have been changed and you need to wait a few minutes to grant admin consent.

I am not an admin so I always get my admin to go into Azure Active Directory and Grant Admin Consent for Peak Indicators

Note that your app now has configured permissions for Azure Active Directory Graph and Azure Storage

Assign your new app to a subscription

Now that you have an app, you need to assign it Contributor status at the level of service you require in Azure: subscription level, resource group level or resource level.

For this app I am going to set it up against the subscription. First go to the subscription you want to add it to and then Access Control (IAM).

I have added the app as a contributor

Creating a Key Vault

We will be selecting and creating IDs in the next steps, but instead of simply remembering your secret, why not store it in a Key Vault?

  • Centralise Application Secrets
  • Store Secrets and Keys Securely
  • Monitor Access And Use

Let's set one up in our Proof of Concept area.

Create a Key Vault if you don't have one already.

Remember to add any tags you need before Review + Create.

Once completed you can go to the resource (e.g. Data Factory), but for the time being that is all you need to do.

Application ID and Tenant ID

You can now go into your new app in Azure (App registrations) to get more details for Data Factory (when you set up the connection).

Tenant from Data Factory will be mapped to Directory (Tenant ID) from the App Overview

Service Principal ID from Data Factory will be mapped to Application (Client) ID From the App Overview

Create a Client Secret

Next, create your Client Secret.

In your App go to Certificates and Secrets

Click New Client Secret

I'm going to allow this secret to expire in a year (anything using the app will start to fail, so you would need to set a new secret and re-authorise).

We can add this into the Key Vault so we don't lose it, because once you have finished here you don't see it again.

Open a new Azure window and go to your new Key Vault.

Go to Secrets

Click + Generate/Import

Notice I have set the expiration date to match the expiry date of the app

Ensuring the Access is set for the Data Lake Storage

For this you need to have a Data Lake Gen 2 set up and Microsoft Azure Storage Explorer downloaded.

In Microsoft Azure Storage Explorer, navigate to the storage.

Then right-click on the file system (in this case factresellersales), go to Manage Access and add the app.

Notice that we have set Read, Write and Execute for the app on the file system, and all the files will inherit these permissions.
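An added example, not part of the original post, that checks the outcome of the Manage Access step from the ACL text itself (the same `user:<id>:rwx,default:user:<id>:rwx` form that Storage Explorer and the Azure CLI use for Data Lake Gen 2 ACLs). It parses the file system's ACL, applies the mask entry the way POSIX ACLs do, and reports whether the app has read, write and execute on the file system and a matching default entry so newly created files and folders inherit it. The object ID and ACL strings are constructed. Note that the id in an ACL entry is the service principal's object ID, not the Application (client) ID used on the Data Factory connection.

```python
from collections import namedtuple

AclEntry = namedtuple("AclEntry", "default scope principal perms")

PERM_BITS = (("r", 4), ("w", 2), ("x", 1))

# Constructed object ID of the service principal (not the application/client ID).
APP_OID = "22222222-2222-2222-2222-222222222222"

# Constructed ACL as set via Manage Access: rwx on the file system plus a default
# entry so children inherit it.
GOOD_ACL = ("user::rwx,group::r-x,other::---,user:%s:rwx,mask::rwx,"
            "default:user::rwx,default:group::r-x,default:other::---,"
            "default:user:%s:rwx,default:mask::rwx") % (APP_OID, APP_OID)

# Constructed ACL that looks fine in the entry but is cut down by the mask and
# has no default entry, so new files would not inherit the app's access.
MASKED_ACL = "user::rwx,group::r-x,other::---,user:%s:rwx,mask::r-x" % APP_OID


def perm_bits(perms):
    """'rwx' style string -> integer bits."""
    return sum(bit for ch, bit in PERM_BITS if ch in perms)


def bits_to_perms(bits):
    return "".join(ch if bits & bit else "-" for ch, bit in PERM_BITS)


def parse_acl(acl_text):
    """Parse a Data Lake Gen 2 ACL string ([default:]scope:id:perms, comma separated)."""
    entries = []
    for raw in acl_text.split(","):
        raw = raw.strip()
        if not raw:
            continue
        parts = raw.split(":")
        default = parts[0] == "default"
        if default:
            parts = parts[1:]
        if len(parts) != 3:
            raise ValueError("malformed ACL entry: %r" % raw)
        scope, principal, perms = parts
        if scope not in ("user", "group", "other", "mask"):
            raise ValueError("unknown scope in ACL entry: %r" % raw)
        if len(perms) != 3 or any(c not in ok for c, ok in zip(perms, ("r-", "w-", "x-"))):
            raise ValueError("bad permission string in ACL entry: %r" % raw)
        entries.append(AclEntry(default, scope, principal, perms))
    return entries


def effective_perms(entries, object_id, default=False):
    """Effective permission string for a named user/group entry in the access ACL
    (default=False) or the default ACL (default=True). Named entries are limited
    by the mask entry, as in POSIX ACLs. Returns None if the principal has no entry."""
    selected = [e for e in entries if e.default == default]
    named = [e for e in selected if e.scope in ("user", "group") and e.principal == object_id]
    if not named:
        return None
    bits = 0
    for entry in named:
        bits |= perm_bits(entry.perms)
    masks = [e for e in selected if e.scope == "mask"]
    if masks:
        bits &= perm_bits(masks[0].perms)
    return bits_to_perms(bits)


def check_app_access(acl_text, object_id, required="rwx"):
    """Findings (empty list = good): does the app have `required` on the file system,
    and is there a default entry granting the same so new children inherit it?"""
    entries = parse_acl(acl_text)
    need = perm_bits(required)
    findings = []
    access = effective_perms(entries, object_id)
    if access is None:
        findings.append("no access ACL entry for %s" % object_id)
    elif perm_bits(access) & need != need:
        findings.append("access ACL grants %s after mask, need %s" % (access, required))
    inherited = effective_perms(entries, object_id, default=True)
    if inherited is None:
        findings.append("no default ACL entry for %s: new files will not inherit access" % object_id)
    elif perm_bits(inherited) & need != need:
        findings.append("default ACL grants %s after mask, need %s" % (inherited, required))
    return findings


if __name__ == "__main__":
    print("good:", check_app_access(GOOD_ACL, APP_OID))
    print("masked:", check_app_access(MASKED_ACL, APP_OID))
```

Run this against the ACL text the Azure CLI shows for the file system root (for example from `az storage fs access show`) after the Manage Access step; two findings on the masked sample show the two ways a "Read/Write/Execute" tick can still leave a pipeline failing.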

Adding The Data Lake Gen 2 Connector in Data Factory (Test)

I have a Data Lake Gen 2 with some files and I want to move them into a SQL database.

To test, Open or create a Data Factory

Go into Author and Monitor. Then Author

Go to Connections, +New and Choose Azure Data Lake Gen 2

Tenant = Directory (Tenant ID) from the App Overview

Service Principal ID = Application (Client) ID From the App Overview

Service Principal Key: you can get it from Azure Key Vault. Click on Secrets, then the name and current version.

You can then copy the secret value and add it into Data Factory

Test your Connection

Create the Data Lake Data Set

Here is where you know that all your efforts were worthwhile.

Create a new Dataset which will be an Azure Data Lake Gen 2.

This is great. I have access to the files in the data lake. Achievement unlocked.
