Azure Cloud


The support for Azure Cloud requires Nextflow version 21.04.0 or later. If you don’t have it installed use the following command to download it in your computer:

curl | bash
./nextflow self-update

Azure Blob Storage

Nextflow has built-in support for Azure Blob Storage. Files stored in a Azure blob container can be accessed transparently in your pipeline script like any other file in the local file system.

The Blob storage account name and key needs to be provided in the Nextflow configuration file as shown below:

azure {
  storage {
    accountName = "<YOUR BLOB ACCOUNT NAME>"
    accountKey = "<YOUR BLOB ACCOUNT KEY>"

As an alternative to the account key it can also used Shared Access Token using the setting sasToken in place of the accountKey attribute.


When creating the Shared Access Token make sure to allow the resource types Container and Object and allow the permissions: Read, Write, Delete, List, Add, Create.


The value of sasToken is the token stripped by the character ? from the beginning of the token.

Once the Blob Storage credentials are set you can access the files in the blob container like local files prepending the file path with the az:// prefix followed by the container name. For example, having a container named my-data container a file named foo.txt you can access it in your Nextflow script using the following fully qualified file path az://my-data/foo.txt.

Azure Batch

Azure Batch is a managed computing service that allows the execution of containerised workloads in the Azure cloud infrastructure.

Nextflow provides a built-in support for Azure Batch which allows the seamless deployment of a Nextflow pipeline in the cloud offloading the process executions as Batch jobs.

Get started

1 - Create a Batch account in Azure portal. Take note of the account name and key.

2 - Make sure to adjust your quotas on the pipeline’s needs. There are limits on certain resources associated with the Batch account. Many of these limits are default quotas applied by Azure at the subscription or account level. Quotas impact on the number of Pools, CPUs and Jobs you can create at any given time.

3 - Create a Storage account and, within, an Azure Blob Container in the same location where the Batch account was created. Take note of the account name and key.

4 - Associate the Blob Storage account with the Azure Batch account.

5 - Make sure your pipeline processes specify one or more Docker containers by using the container directive.

6 - The container images need to be published into Docker registry such as Docker Hub, Quay or Azure Container Registry that can be reached by Azure Batch environment.

A minimal configuration looks like the following snippet:

process {
  executor = 'azurebatch'

azure {
  storage {
    accountName = "<YOUR STORAGE ACCOUNT NAME>"
    accountKey = "<YOUR STORAGE ACCOUNT KEY>"
  batch {
    location = '<YOUR LOCATION>'
    accountName = '<YOUR BATCH ACCOUNT NAME>'
    accountKey = '<YOUR BATCH ACCOUNT KEY>'
    autoPoolMode = true

In the above example, replace the location and the account placeholders with the value corresponding to your configuration and save it to a file named nextflow.config.

Given the previous configuration, launch the execution of the pipeline using the following command:

nextflow run <PIPELINE NAME> -w az://YOUR-CONTAINER/work

Replacing <PIPELINE NAME> with a pipeline name e.g. nextflow-io/rnaseq-nf and YOUR-CONTAINER a blob container in the storage account defined in the above configuration.

See the Batch documentation for further details about the configuration for the Azure Batch service.

Pools configuration

When using the autoPoolMode setting Nextflow automatically creates a pool of computing nodes to execute the jobs run by your pipeline. By default it only uses 1 compute node of Standard_D4_v3 type.

The pool is not removed when the pipeline execution terminates, unless the configuration setting deletePoolsOnCompletion=true is added in your pipeline configuration file.


Don’t forget to clean up the Batch pools to avoid in extra charges in the Batch account or use the auto scaling feature.


Make sure your Batch account has enough resources to satisfy the pipeline’s requirements and the pool configuration.

Pool specific settings, e.g. VM type and count, should be provided in the auto pool configuration scope, e.g.

azure {
    batch {
        pools {
            auto {
               vmType = 'Standard_D2_v2'
               vmCount = 10

Named pools

If you want to have a more precise control on the computing nodes pools used in your pipeline using a different pool depending on the task in your pipeline, you can use the Nextflow queue directive to specify the ID of a Azure Batch compute pool that has to be used to run that process’ tasks.

The pool is expected to be already available in the Batch environment, unless the setting allowPoolCreation=true is provided in the batch setting in the pipeline configuration file. In the latter case Nextflow will create the pools on-demand.

The configuration details for each pool can be specified using a snippet as shown below:

azure {
    batch {
        pools {
            foo {
               vmType = 'Standard_D2_v2'
               vmCount = 10

            bar {
                vmType = 'Standard_E2_v3'
                vmCount = 5

The above example defines the configuration for two node pools. The first will provision 10 compute nodes of type Standard_D2_v2, the second 5 nodes of type Standard_E2_v3. See the Advanced settings below for the complete list of available configuration options.

Requirements on pre-existing named pools

When Nextflow is configured to use a pool already available in the Batch account, the target pool must satisfy the following requirements:

1 - the pool must be declared as dockerCompatible (Container Type property)

2 - the task slots per node must match with the number of cores for the selected VM. Nextflow would return an error like “Azure Batch pool ‘ID’ slots per node does not match the VM num cores (slots: N, cores: Y)”.

Pool autoscaling

Azure Batch can automatically scale pools based on parameters that you define, saving you time and money. With automatic scaling, Batch dynamically adds nodes to a pool as task demands increase, and removes compute nodes as task demands decrease.

To enable this feature for pools created by Nextflow, add the option autoScale = true to the corresponding pool configuration scope. For example, when using the autoPoolMode, the setting looks like:

azure {
    batch {
        pools {
            auto {
               autoScale = true
               vmType = 'Standard_D2_v2'
               vmCount = 5
               maxVmCount = 50

Nextflow uses the formula shown below to determine the number of VMs to be provisioned in the pool:

// Get pool lifetime since creation.
lifespan = time() - time("{{poolCreationTime}}");
interval = TimeInterval_Minute * {{scaleInterval}};

// Compute the target nodes based on pending tasks.
// $PendingTasks == The sum of $ActiveTasks and $RunningTasks
$samples = $PendingTasks.GetSamplePercent(interval);
$tasks = $samples < 70 ? max(0, $PendingTasks.GetSample(1)) : max( $PendingTasks.GetSample(1), avg($PendingTasks.GetSample(interval)));
$targetVMs = $tasks > 0 ? $tasks : max(0, $TargetDedicatedNodes/2);
targetPoolSize = max(0, min($targetVMs, {{maxVmCount}}));

// For first interval deploy 1 node, for other intervals scale up/down as per tasks.
$TargetDedicatedNodes = lifespan < interval ? {{vmCount}} : targetPoolSize;
$NodeDeallocationOption = taskcompletion;

The above formula initialises a pool with the number of VMs specified by the vmCount option, it scales up the pool on-demand, based on the number of pending tasks up to maxVmCount nodes. If no jobs are submitted for execution, it scales down to zero nodes automatically.

If you need a different strategy you can provide your own formula using the scaleFormula option. See the Azure Batch documentation for details.

Advanced settings

The following configuration options are available:



The blob storage account name

The blob storage account key

The blob storage shared access signature token. This can be provided as an alternative to the accountKey setting.

The duration of the shared access signature token created by Nextflow when the sasToken option is not specified (default: 12h).


The batch service account name.


The batch service account key.


The batch service endpoint e.g.


The batch service location e.g. westeurope. This is not needed when the endpoint is specified.


Enable the automatic creation of batch pools depending on the pipeline resources demand (default: true)


Enable the automatic creation of batch pools specified in the Nextflow configuration file (default: false)


Enable the automatic deletion of jobs created by the pipeline execution (default: true).


Enable the automatic deletion of compute node pools upon pipeline completion (default: false).


Specify where the azcopy tool used by Nextflow. When node is specified it’s copied once during the pool creation. When task is provider, it’s installed for each task execution (default: node)


Specify the virtual machine type used by the pool identified with <name>.


Specify the number of virtual machines provisioned by the pool identified with <name>.


Specify the max of virtual machine when using auto scale option.


Enable autoscaling feature for the pool identified with <name>.


Specify the scale formula for the pool identified with <name>. See Azure Batch scaling documentation for details.


Specify the interval at which to automatically adjust the Pool size according to the autoscale formula. The minimum and maximum value are 5 minutes and 168 hours respectively (default: 10 mins)


Specify the scheduling policy for the pool identified with <name>. It can be either spread or pack (default: spread).