Azure Data Factory - Run single instance of pipeline at a time

Article showing how to run only a single instance of a pipeline at a time

Posted by thebernardlim on May 7, 2020

I recently noticed that multiple instances of my pipeline were running concurrently. This happened because the pipeline's runtime exceeded the trigger interval.

I googled around for a proper solution, and the best I could find was to set the ‘Concurrency’ value of the pipeline to ‘1’. This ensures that only one instance of the pipeline runs at a time.

Concurrency Setting
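The same setting appears in the pipeline's JSON definition as the `concurrency` property under `properties`. As a minimal sketch, the Python below sets it on a pipeline definition loaded as a dict. The helper name `with_single_concurrency` and the example pipeline are hypothetical and only illustrate where the property lives.

```python
import copy
import json


def with_single_concurrency(pipeline_definition: dict) -> dict:
    """Return a copy of a pipeline definition with concurrency set to 1."""
    updated = copy.deepcopy(pipeline_definition)
    # "concurrency" sits next to "activities" inside "properties".
    updated.setdefault("properties", {})["concurrency"] = 1
    return updated


# Hypothetical, stripped-down pipeline definition for illustration only.
example = {"name": "MyPipeline", "properties": {"activities": []}}
print(json.dumps(with_single_concurrency(example), indent=2))
```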

Problem

The only problem is that if a pipeline run is still in progress when the trigger fires again, the new run is queued rather than skipped.

Say the trigger interval is one hour and a pipeline run suddenly takes 10 hours: that means 10 pipeline runs sitting in the queue!

Apparently there is still no out-of-the-box solution for this, which led me to further investigate other options.

Solution

One way to handle this is to use the Data Factory REST API.

There is an article that describes how its author used the REST API to access Data Factory pipelines and get their current statuses. It also provides sample code, which I used as my starting point.

In simple terms, all I needed to do was to:

  1. Create a Function App.
  2. Create an HTTP-triggered function which will:
    • Send a GET request to the ADF REST API to retrieve the current pipeline run status.
    • If the status is ‘Succeeded’, send a POST request to the ADF REST API to create a pipeline run.
    • If the status is not ‘Succeeded’, send a response telling the user that a new pipeline run cannot be created yet.

My code for this function can be found here in my GitHub repository. The function is named ‘RunPipelineByName’.
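For readers who only want the shape of the logic, the steps above can be sketched in Python roughly as follows. This is not the code from the repository. The function name `run_pipeline_if_idle` and its two callables are hypothetical stand-ins for the real REST calls (fetching the latest run status and posting to ‘createRun’), injected so the decision itself can be read and tested without touching Azure.

```python
from typing import Callable, Optional


def run_pipeline_if_idle(
    get_latest_status: Callable[[], Optional[str]],
    create_run: Callable[[], str],
) -> dict:
    """Create a new pipeline run only if the previous run has succeeded.

    get_latest_status: hypothetical callable returning the latest run's
        status string (for example "Succeeded" or "InProgress").
    create_run: hypothetical callable that starts a run and returns its ID.
    """
    status = get_latest_status()
    if status == "Succeeded":
        run_id = create_run()
        return {
            "created": True,
            "status": "Pipeline run successfully created",
            "runId": run_id,
        }
    # Any other status (including no previous run) blocks a new run,
    # following the steps above literally.
    return {
        "created": False,
        "status": f"Latest run status is {status!r}; new run not created",
    }


# Usage with stubbed callables in place of real REST calls.
print(run_pipeline_if_idle(lambda: "Succeeded", lambda: "stub-run-id"))
print(run_pipeline_if_idle(lambda: "InProgress", lambda: "stub-run-id"))
```

Note that following the steps literally means a previous run that ended as ‘Failed’ or ‘Cancelled’ also blocks the next run. Depending on your needs, you might prefer to block only while the status is ‘Queued’ or ‘InProgress’.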

Testing

Let’s test this out!

  1. To test this, import the following script into Postman.
  2. Once installed, there will be a list of requests as per the screenshot below.

    Postman Script Overview

  3. The ADF pipeline ‘createRun’ API requires an access token, so we first need to call the Microsoft identity platform to get one. Before doing this, you will need to create an App Registration in Azure Active Directory.

    Pipeline Security

    Update the global variables of the script with your respective values.

    Postman Variables

  4. Once the variables are set, send a ‘Get AAD Token’ request. This stores the access token in a global variable, which is used in our next API call.

  5. Send a ‘Run Single Instance ADF Pipeline’ request. If the returned ‘Status’ is ‘Pipeline run successfully created’, your pipeline is now running!

    Pipeline Run Created
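If you would rather script the token call from step 3 than use Postman, a minimal Python sketch could look like the one below. It assumes the OAuth 2.0 client credentials flow against the v2.0 token endpoint, with the Azure Resource Manager scope. The Postman script may be configured differently, and the function names and placeholder values here are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Request:
    """Build (but do not send) a client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,          # from the App Registration
        "client_secret": client_secret,  # from the App Registration
        # Assumed scope: Azure Resource Manager, which fronts the ADF REST API.
        "scope": "https://management.azure.com/.default",
    }).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def fetch_access_token(request: Request) -> str:
    """Send the token request and return the access token from the JSON reply."""
    with urlopen(request) as response:
        return json.load(response)["access_token"]


# Placeholder values; replace with your own before calling fetch_access_token.
token_request = build_token_request("<tenant-id>", "<client-id>", "<client-secret>")
print(token_request.get_method(), token_request.full_url)
```

The returned token is then sent as an `Authorization: Bearer <token>` header on the subsequent request, which is the role the Postman global variable plays in steps 4 and 5.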