
Azure SQL Database Dev to Production Part 4

I have had quite a lot of issues with the whole dev to prod process for the SQL database. My last attempt, which I wrote up in this blog, worked well until I shut the project down: once reopened I would always lose my project or Git connection. So I went back to the drawing board, did a lot more research, and here are my new findings.

There is a Data Factory part to this, but I have already written a blog about it and it has worked consistently ever since setting up the dev to prod process.

Resources used

  • Azure SQL Database and Server
  • Visual Studio (Enterprise)
  • Azure DevOps

Azure DevOps Repository

First of all you need an Azure DevOps organisation set up (I won't go into detail on that here).

In the DevOps repo I have a folder for Data Factory. The folder for SQLDB will be created later.

  • In DevOps, ensure you have a Git repository created, then click Clone to copy the Git location URL

In this example I am cloning right at the top of the repository.

  • Click the Copy button

Visual Studio

You cannot do this yet in Visual Studio Code. It has to be Visual Studio, and I have Visual Studio Enterprise 2019.

  • Copy the repository location from the clone copy (or browse a repository – Azure DevOps or GitHub)
  • Make sure you are happy with the path for the local copy
  • Click Clone. Your local repository then shows in Solution Explorer
  • This has added the folder on your C drive (the top level plus the Data Factory and SQLDB folders)
  • And you can see this project in Solution Explorer
  • Copy the path on the C drive (and the folder, for example SQLDB)
  • In Visual Studio, on the top bar choose File – New – Project
  • Choose SQL Server Database Project – Next

The project will be SQLDB and will contain the SQL objects.

  • Click Create

In Visual Studio Solution Explorer you can see your empty database objects.

On the C drive, note you now have a SQLDB folder along with the Data Factory folder.

  • Right click on the database name in Solution Explorer and go to Properties

It's important to be on the right version for the target platform.

  • Right click on the database name and choose Import – Database
  • Select the connection for the development database (and use Show connection properties to make sure your username and password are OK and the database connects)
  • Import the objects into the local project (no need to select it in the above box)
  • Then click Finish. Note that all your objects are now in Solution Explorer and on the C: drive (your local copy)
  • Is the project complete? Build – Rebuild Solution, which checks and validates the objects

Any time anything changes you need to rebuild your solution to update the code.
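If you want to script this step, the same rebuild can be run with MSBuild (a minimal sketch, assuming a Developer PowerShell prompt where msbuild is on the PATH, and the example project name SQLDB from this post):

  # Rebuild the database project outside Visual Studio; a failed build
  # surfaces the same warnings and errors as Build - Rebuild Solution.
  & msbuild .\SQLDB\SQLDB.sqlproj /t:Rebuild /p:Configuration=Release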

Warnings and Errors

For warnings and errors, you can see all the issues by clicking on them. The build may fail because of errors; these always need resolving before you deploy to the target DB, e.g. Production.

Error Example – Warning SQL71558: The object reference [staging].[ST2].[KEY] differs only by case from the object definition [staging].[ST2].[Key].

  • Right click on the database in Solution Explorer and go to Properties.
  • In Project Settings, untick Validate casing on identifiers

Error Example – Warning SQL71502: Procedure: [dim].[USP_Date] has an unresolved reference to object [sys].[all_objects].

You can add the master database as a reference (right click on References).

Add Database Reference

Rebuild your codebase. It's important here to make sure your warnings and errors have been dealt with.

Rebuild updates your project locally after a change. I will look at how making changes with, for example, SQL Server Management Studio alters the process in a later blog.

  • Publish to the Git repository – go to Git Changes, make a note of your change and Commit All

Then click the arrow to push changes to the Git repository.

We now have the code in the repository in DevOps

It seems annoyingly easy to slightly mess your folder structure up. Here I have a SQLDB folder and another SQLDB folder inside it.

I only wanted the one. This keeps happening to me and it's very frustrating. Any pointers to where I went wrong would be really appreciated.

Create your CI (Continuous Integration) Pipeline in Azure DevOps

Now we have the code in Git we can create our artifacts for the release pipeline.

  • In Azure DevOps go to Pipelines
  • Click New Pipeline
  • And choose your repository
  • Select a template

Right click on tasks and remove selected tasks until you are left with the following

You don’t really have to do much with these three jobs

At the pipeline level, ensure you use the right agent, for example windows-2019. We had errors because we use the OPENJSON function in the SQL code, but setting the right agent resolved the issue.

All the other jobs are parameterised. This should now be all set

  • Save and Queue, and you can then run the pipeline to create your artifacts (Save and Run)
  • There may be warnings here. For some reason the warnings you clear in Visual Studio still show in DevOps. I would like to do a bit more research on this.
  • But if a warning hasn't failed the process you should now have your Continuous Integration artifacts.

Create your CD (Continuous Delivery) Release Pipeline in Azure DevOps

Now we are onto Continuous Delivery: moving the new code into the Azure SQL DB.

  • In Azure DevOps go to Pipelines – Releases – New release
  • For Artifacts, click Add

We want to use the latest build

Now add a Stage. In our case we are releasing straight to Prod, so it's a simple release.

  • Start with an Empty Job

Add a task to the job

Here we chose the production subscription.

And we link to the DACPAC file that was created by the build from Visual Studio. The DACPAC contains all the SQL objects.
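Under the hood the deployment task is doing roughly what SqlPackage does from the command line. A sketch for comparison (the server name, database name and artifact path are placeholders following this post's naming, not values from the real pipeline):

  # Publish the DACPAC produced by the CI build to the target database.
  & SqlPackage.exe /Action:Publish `
      /SourceFile:".\drop\SQLDB.dacpac" `
      /TargetServerName:"prd-uks-project-sql.database.windows.net" `
      /TargetDatabaseName:"prd-uks-project-sqldb" `
      /TargetUser:$env:SQL_DEPLOY_USER `
      /TargetPassword:$env:SQL_DEPLOY_PASSWORD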

The database uses variables and you can set these up in the variables tab

You can create a release to update your Production database

Once pushed, check your SQL database in Production to make sure you are happy that your changes have gone through.
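One quick way to check (a sketch, assuming the SqlServer PowerShell module and a login with read access; names follow this post's naming) is to list the most recently modified objects:

  # Smoke check after a release: show what changed most recently in prod.
  Invoke-Sqlcmd -ServerInstance "prd-uks-project-sql.database.windows.net" `
      -Database "prd-uks-project-sqldb" `
      -Username $env:SQL_DEPLOY_USER -Password $env:SQL_DEPLOY_PASSWORD `
      -Query "SELECT TOP (20) name, type_desc, modify_date FROM sys.objects ORDER BY modify_date DESC;"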

And you can save your Visual Studio project and reopen it. The next stage is to update some objects and go through the process again, so watch this space.


Azure Data Factory Moving from Development to Production – Part 2. Using Key Vault for Linked Services

In Azure Data Factory Moving from Development to Production we looked at how we can use Azure DevOps to move the JSON code for the Development Data Factory to Production.

It's going well; I have however been left with an issue. Every time I move into Production, the details for the Linked Services have to be re-added. Let's have a look at the SQL Server and the Data Lake Gen 2 account.

Development

Notice that the information has been entered manually including the Storage account Key.

Again, in this instance the information has been entered manually. SQL Server Authentication is being used because we have a user in the SQL DB with all the privileges that Data Factory Needs.

DevOps Data Factory release Pipeline

Go into Edit of the Release Pipeline

Within Prod Stage we have an Agent Process

We are looking for the Section Override Template Parameters

Note that currently Account Key and SQL Database Connection String are null.

Provisioning Azure Key vault to hold the Secrets

Managed Identity for Data Factory

Copy your Azure Data Factory name from Data Factory in Azure.

You need to have a Key Vault set up in Development.

GET and LIST allow Data Factory to read secret information from the Key Vault.

Paste the Data Factory name into Select Principal.
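The same grant can be scripted with the Az PowerShell modules (a sketch; the resource group, factory and vault names are assumptions following this post's naming):

  # Give the Data Factory managed identity Get and List on the vault's secrets.
  $adf = Get-AzDataFactoryV2 -ResourceGroupName "dev-uks-project-rg" `
                             -Name "dev-uks-project-adf"
  Set-AzKeyVaultAccessPolicy -VaultName "dev-uks-project-kv" `
      -ObjectId $adf.Identity.PrincipalId `
      -PermissionsToSecrets Get,List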

In Key Vault, create a Secret for the Azure Data Lake Storage.

For the Key Vault Secret, I gave it the secret value by copying across the Access Key from the Azure Storage Account's Keys section.

The Content type was simply set to the name of the Storage Account for this exercise.
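Scripted, that looks roughly like this (a sketch; the storage account name is a placeholder, and note that Key Vault secret names may only contain letters, numbers and hyphens, so the underscores from the linked service parameter name can't be used verbatim):

  # Copy the data lake's access key into a Key Vault secret, with the
  # storage account name as the content type, as described above.
  $key = (Get-AzStorageAccountKey -ResourceGroupName "dev-uks-project-rg" `
          -Name "devuksprojectdls")[0].Value
  Set-AzKeyVaultSecret -VaultName "dev-uks-project-kv" `
      -Name "AzureDataLakeStorageGen2-LS-accountKey" `
      -SecretValue (ConvertTo-SecureString $key -AsPlainText -Force) `
      -ContentType "devuksprojectdls"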

In Data Factory, create a Linked Service to the Key Vault.

Test and ensure it successfully connects.

Use the new Key Vault to reset the Data Lake Linked Service.

How does this Data Lake Linked Service change the DevOps Release Pipeline?

It's time to release our new Data Factory settings into Production. Make sure you have published Data Factory into the DevOps Git repo.

Production Key vault Updates

We need to update Production in the same way as Development

  1. In the Production Key Vault, add the Production Data Factory name to Access Policies (as an application) with Get and List on secrets
  2. Ensure that there is a Secret for the Production Data Lake key, AzureDataLakeStorageGen2_LS_accountKey
  3. Check your Key Vault connection works in Production before the next step

Azure DevOps Repos

In Azure DevOps go to your Data Factory repos.

Notice that your Linked Service information for the Data Lake now references the Key Vault secret. It's not hardcoded anymore, which is exactly what we want.

Azure DevOps Release Pipeline

Go to Edit in the Data Factory release pipeline

When the job in Prod is clicked on, you can go to the Override Parameters section. And notice there is now an error.

AzureKeyVault1_properties_typeProperties_baseUrl is the missing parameter. Basically, at this point you need to delete the code in the Override Template Parameters box and then click the button to regenerate the new parameters.

Override with the production information (I saved the code so I could re-copy the bits I need).

Once done, notice that the -AzureDataLakeStorageGen2_LS_accountKey "" parameter is now gone, because it's being handled by the Key Vault.

Let's Save and Create a Release.

New failures in the Release

2021-02-08T13:45:13.7321486Z ##[error]ResourceNotFound: The Resource ‘Microsoft.DataFactory/factories/prod-uks-Project-adf’ under resource group ‘prd-uks-Project-rg’ was not found. For more details please go to https://aka.ms/ARMResourceNotFoundFix

Make sure that your override parameters are OK. I updated:

  • Data Factory name from Data Factory
  • Primary endpoint data Lake Storage from Properties
  • Vault URI from Key vault Properties

Repeat the Process for SQL Database

With everything in place, we need to introduce a connection string into Key Vault.

I have a user set up in my SQL database. The user has been granted SELECT, INSERT, UPDATE, DELETE, EXECUTE and ALTER on all schemas.
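For illustration, the user and grants could be set up like this (a sketch using Invoke-Sqlcmd from the SqlServer module; the admin credentials are placeholders, and granting at the database level covers all schemas):

  # Create a contained database user and grant the permissions above.
  Invoke-Sqlcmd -ServerInstance "dev-uks-project-sql.database.windows.net" `
      -Database "dev-uks-project-sqldb" `
      -Username $env:SQL_ADMIN_USER -Password $env:SQL_ADMIN_PASSWORD `
      -Query "CREATE USER projectDBowner WITH PASSWORD = 'Password1';
              GRANT SELECT, INSERT, UPDATE, DELETE, EXECUTE, ALTER TO projectDBowner;"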

I want to include the user name and password in the Connection string and use SQL authentication

Secret Name

AzureSQLDatabase-project-ConnectionString

Secret

 Server=tcp:dev-uks-project-sql.database.windows.net,1433;Database=dev-uks-project-sqldb;User Id=projectDBowner;Password=Password1;

The connection string has been set as above. For more information on connection strings see SQL Server Connection Strings.
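Creating the secret can also be scripted rather than done in the portal (a sketch; the vault name is an assumption following this post's naming):

  # Store the SQL connection string as the Key Vault secret named above.
  $conn = "Server=tcp:dev-uks-project-sql.database.windows.net,1433;" +
          "Database=dev-uks-project-sqldb;User Id=projectDBowner;Password=Password1;"
  Set-AzKeyVaultSecret -VaultName "dev-uks-project-kv" `
      -Name "AzureSQLDatabase-project-ConnectionString" `
      -SecretValue (ConvertTo-SecureString $conn -AsPlainText -Force)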

Go back to Data Factory and set up the new secret for SQL Server.

This is successful

Data Factory and DevOps

  1. Back in Data Factory, publish the new Linked Service code
  2. Go into the Dev repos and check that you are happy with the new Key Vault information in the Linked Service code
  3. Go to the Prod Key Vault and make sure the Secret is set with the connection string for the SQL DB
  4. Test that the Key Vault secret works in Prod
  5. Back in DevOps, go to Release Pipelines and Edit the adf release CD pipeline (release pipelines are CD, Continuous Delivery; build pipelines are CI, Continuous Integration)
  6. Edit the Prod stage (I only have Prod) ARM Template Deployment task, and copy the Override Template Parameters code into a file for later
  7. Delete the code and click the … to get the latest parameter information
  8. Re-add your production parameters; most can be taken from the code you just copied.
  9. Create a new Release
  10. Go to Linked Services in Data Factory and check they are still Production. They still use Key Vault and they still work

Now this is all in place, the Development Data Factory can be published up to Production. There is no need to reset Linked Services, and all your information about keys and passwords is hidden in the Key Vault.

Azure Data Factory – Moving from Development to Production

When working on larger projects we need to merge changes from multiple developers. When all the changes are in the central branch we can then have an automated process to move development to Production.

Smoke tests

In computer programming and software testing, smoke testing is preliminary testing to reveal simple failures severe enough to, for example, reject a prospective software release

Integration testing

Integration testing is the phase in software testing in which individual software modules are combined and tested as a group. Integration testing is conducted to evaluate the compliance of a system or component with specified functional requirements. It occurs after unit testing and before validation testing

Resources Involved with the current Project

  • Azure DevOps
  • Azure SQL Server
  • Azure SQL Database
  • Azure Data Factory
  • Azure Data Lake Gen 2 Storage
  • Azure Blob Storage
  • Azure Key vault

Each resource has its own specific requirements when moving from Dev to Prod.

We will be looking at all of them separately along with all the security requirements that are required to ensure that everything works on the Production side

This post specifically relates to Azure Data Factory and DevOps

Azure Data Factory CI/CD Lifecycle

Git does all the creating of the feature branches and then merging them back into main (master).

Git is used for version control.

In terms of Data Factories, you will have a Dev factory, a UAT factory (if used) and a Prod factory. You only need to integrate your development data factory with Git.

The pull request merges the feature into master.

Once published we need to move the changes to the next environment, in this case Prod (when ready).

This is where Azure DevOps Pipelines come into play.

If we are using Azure DevOps Pipelines for continuous delivery, the following things will happen:

  • The DevOps pipeline gets the PowerShell script from the master branch
  • Then gets the ARM template from the publish branch
  • Deploys the PowerShell script to the next environment
  • Deploys the ARM template to the next environment (roughly as sketched below)
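The ARM deployment step amounts to roughly the following (a sketch, not the release task itself; the template file names are the ones Data Factory generates in the adf_publish branch, while the resource group and factory names are illustrative, following this post's naming):

  # Deploy the factory ARM template to production, overriding the factory
  # name. Any parameter not overridden keeps the value exported from Dev.
  New-AzResourceGroupDeployment -ResourceGroupName "prd-uks-project-rg" `
      -TemplateFile ".\ARMTemplateForFactory.json" `
      -TemplateParameterFile ".\ARMTemplateParametersForFactory.json" `
      -factoryName "prd-uks-project-adf"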

Why use Git with Data Factory

  • Source control allows you to track and audit changes
  • You can do partial saves when, for example, you have an error. Data Factory won't allow you to publish, but with Git you can save where you are and resolve the issues another time
  • It allows you to collaborate more with team members
  • Better CI/CD when deploying to multiple environments
  • Data Factory is many times faster with a Git back end than it is when authoring against the Data Factory service, because resources are downloaded from Git
  • Adding your code into Git rather than simply into the Azure service is actually more secure and faster to process

Setting Up Git

We already have an Azure DevOps project with Repos and Pipelines turned on.

We already have Azure subscriptions and resource groups for both the Production and Development environments.

There is already a working Data Factory in development.

In this example Git was set up through the Data Factory management hub (the toolbox).

Azure DevOps Git was used for this project rather than GitHub because we have Azure DevOps.

Settings

The Project Name matches the project in DevOps.

The Collaboration branch is used for publishing and by default it's the master branch. You can change the setting in case you want to publish from another branch.

Import existing resources to repository means that all the work done before adding Git can be added to the repository.

DevOps is now set up.

Close Azure Data Factory (if it is open) so we can reopen it and go through the Git process.

Where to find your Azure DevOps

You should now be able to go to your own area in DevOps, select Repos, and select the created project within Azure DevOps.

You will need an account in Azure DevOps to click on Repos, and that account must be higher than Stakeholder to access Repos.

You can then select the project you have created.

Using Git with Azure Data Factory

In Azure, open up your Data Factory.

Git has been enabled (go to Manage to review Git).

The master branch is the main branch with all the development work on it.

We now develop a new feature. Create a feature branch with + New Branch.

We are now in the feature branch, and I am simply adding a description to a Stored Procedure activity in the pipeline. However, this is where you will now do your development work rather than in master.

For the test, the description of a pipeline is updated. Once completed, my changes are in the feature 1 branch. I can now save my feature.

You don't need to publish to save the work. Save All will save your feature, even if there are errors.

You can go across to DevOps to see the files and history created in Azure DevOps for the feature branch (we will look at this once merged back into master).

Once happy, create the pull request.

This takes you to a screen to include more details.

Here I have also included the iteration we are currently working on in DevOps Boards.

A few tags are also added. Usually someone will review the work and will also be added here.

The next screen allows you to approve and complete the change.

In this case I have approved. You can also do other things, like make suggestions or reject.

Completing finishes the work and removes the feature branch. Now all the development in the feature branch is in the main, master branch.

In Data Factory, go back to the master branch and note that your feature updates are included.

We now publish the changes in the master branch, which creates the adf_publish branch. This publish branch holds the ARM template that represents the pipelines, linked services, triggers etc.

Once published, in DevOps Repos, there are now files to work with.

You can see your change within the master branch.

(The changes would normally be highlighted on the two comparison screens)

Here we open the Pipelines folder, go to the Compare tab and find the before and after code.

And you can also see your history.

The ARM templates are in the adf_publish branch, if you select this branch.

Once done we need to move the changes to the next environment, in this case Prod (when ready).

This is where Azure DevOps Pipelines come into play.

Continuous Delivery using Azure DevOps

We need another Data Factory to publish changes to.

In this case, Production has been created in the Azure Portal within the Production subscription and Production resource group.

Git configuration is not needed on the Production resource, so skip this step.

Create your tags and Review and Create.

DevOps Pipelines

For this specific project, we don't want to update Production automatically when we publish to Dev. We want this to be something that we can do manually.

Go to Pipelines and create a new release pipeline (in DevOps).

Click on Empty job, because we don't want to start with a template.

And because for this project there is no UAT, just Production, name the release stage Prod.

Click on the X to close the blade.

We need to sort out the Artefact section of the pipeline.

Click on Add an Artefact and choose an artefact from Azure Repos.

We may as well add the adf_publish branch, which contains the ARM templates, and the master branch.

The Source alias was updated to _adf_publish.

Both branches are added as Azure Repos artefacts.

Next we move to Prod and start adding tasks.

Click on "1 job, 0 tasks" to get to the tasks.

Click + against the Agent job to add a task. Our task is ARM Template deployment.

Click Add.

Then click on the new task to configure it.

The first section is where you select your production environment.

Next you need to select the ARM template and the ARM template parameters file. These are updated in the DevOps artefact every time you publish to Dev.

The JSON templates are in the adf_publish branch

Now you need to override the template parameters, because these are all for Dev and we need them to be Production. These will be specific to your own Data Factory environment. In this instance we need to sort out the information for the Key Vault and Data Lake storage account.

factoryName

This one is easy. The only difference is changing dev to prd

AzureDataLakeStorageGen2_LS_properties_typeProperties_url

The Data Lake storage account must be set up in both Dev and Prod before continuing. Go to this storage account resource.

This information is also stored in our Key Vault as a secret, which we can hopefully use at a later date.

It is taken from Storage Account, Properties. We want the primary endpoint for the Data Lake storage.

Copy the primary endpoint URL and override the old one with the new Prod URL in DevOps.

AzureKeyVault1_properties_typeProperties_baseUrl

We need to update https://dev-uks-project-kv.vault.azure.net/

Let's get this overridden. We already have a Key Vault set up in Production. Get the URI from Overview in the Production Key Vault service.

And let's add this into our DevOps parameter.

AzureDataLakeStorageGen2_LS_accountKey

This is empty, but we could add to it later in the process.

Account keys are the kind of thing that should be kept as secrets in Key Vault, in both Dev and Prod.

Let's get them set up. Just for the time being, let's ensure we have the Data Lake storage account key within our Development and Production Key Vaults.

Key Vault

Within Key Vault in Development, create a secret with the name AzureDataLakeStorageGen2LSaccountKey.

The key comes from the storage account's Access keys section.

And repeat for the Production Key Vault.

For the time being though, let's leave this blank now we have captured the information in the Key Vault. It should come in useful at a later date.

AzureSqlDatabaseTPRS_LS_connectionString

This was also empty within the parameters for Dev.

You can get the connection string value by going to your SQL database – Connection strings – PHP and finding the try statement.

And here is the connection string value for Production:

 Server=tcp:prd-uks-project-sql.database.windows.net,1433;Database=prd-uks-project-sqldb;

You can also add this information into Key Vault as a secret, and repeat for Production.

For the first instance we are going to leave it empty, as per the Dev parameters. At some point we should be able to set up a service principal so we can change the hardcoded values to secrets.

The parameters created in Dev are now overridden with the Production values.

The pipeline is then named.

Create a release

Once saved, click back on Releases.

For this type of release we only want to do it manually.

Create a release for our very first manual release.

Click back on Releases.

And click on Release 1 to see how it is doing.

You can click on Logs under the Stages box to get more information.

Now you should be able to go back to the Production Data Factory and see that everything has been set up exactly like Dev.

Go and have a look at the Linked Services in the Production Data Factory.

Note that they are all set with the Production information.

We now have a process to move Dev to Prod whenever we want.

The Process

Throughout the sprint, the development team will have been working on feature branches. These branches are then committed into the master branch and deployed to Dev.

Once you are happy that you want to move your Data Factory across from Dev into Prod, go to the DevOps release pipeline.

Create Release to create a new release.

It uses the artefact of the ARM template, which is always up to date after a publish.

This will create a new release and move the new information to Prod.

All your resources will then be able to move quickly from Dev to Prod, and we will look at this in further posts.