Using Azure AD with the Azure Databricks API

Until now, the Azure Databricks API has required a Personal Access Token (PAT), which had to be generated manually in the UI. This complicates DevOps scenarios. A new feature in preview allows using Azure AD to authenticate with the API. You can use it in two ways:

  • Use Azure AD to authenticate each Azure Databricks REST API call.
  • Use Azure AD to create a PAT token, and then use this PAT token with the Databricks REST API. Note that there is a quota limit of 600 active tokens.

See further down for options using Python or Terraform.

Ensure your service principal has Contributor permissions on the Databricks workspace resource.
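
If you go the PAT route, it can be useful to keep an eye on the 600-token quota mentioned above. The following is a minimal sketch only: it assumes the parsed JSON of a Databricks token/list call has a "token_infos" array whose entries carry an "expiry_time" in epoch milliseconds, with -1 meaning the token never expires. The function name remaining_pat_quota and the sample data are hypothetical.

```python
import time

PAT_QUOTA = 600  # quota of active tokens quoted in this article


def remaining_pat_quota(token_list_response, now_ms=None):
    """Return how many more PAT tokens fit under the quota.

    token_list_response is the parsed JSON of a token/list call, assumed
    to look like {"token_infos": [{"token_id": ..., "expiry_time": ...}]}.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    infos = token_list_response.get("token_infos", [])
    # Assumption: expiry_time is epoch milliseconds, -1 means "never expires".
    active = [
        t for t in infos
        if t.get("expiry_time", -1) == -1 or t["expiry_time"] > now_ms
    ]
    return PAT_QUOTA - len(active)


# Hypothetical sample response: one non-expiring, one expired, one still valid.
sample = {"token_infos": [
    {"token_id": "a", "expiry_time": -1},
    {"token_id": "b", "expiry_time": 1_000},
    {"token_id": "c", "expiry_time": 5_000},
]}
print(remaining_pat_quota(sample, now_ms=2_000))
```

Creating short-lived tokens, as the examples in this article do with lifetime_seconds, is the simplest way to stay well below the limit.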

Option 1 – using Azure CLI

The easiest way is to use Azure CLI. Log in to Azure with a user account or service principal that has Contributor permissions on the workspace.

# Change these values
resourceGroup="MY_RESOURCE_GROUP"
workspaceName="MY_WORKSPACE"
# Databricks workspace host name, without trailing slash
workspaceUrl="northeurope.azuredatabricks.net"

tenantId=$(az account show --query tenantId -o tsv)
wsId=$(az resource show \
  --resource-type Microsoft.Databricks/workspaces \
  -g "$resourceGroup" \
  -n "$workspaceName" \
  --query id -o tsv)

# Get a token for the global Databricks application.
# The resource ID is fixed and never changes.
token_response=$(az account get-access-token --resource 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d)
token=$(jq .accessToken -r <<< "$token_response")

# Get a token for the Azure management API
token_response=$(az account get-access-token --resource https://management.core.windows.net/)
azToken=$(jq .accessToken -r <<< "$token_response")

# Use both tokens in a Databricks API call (here: list instance pools)
curl -sf "https://$workspaceUrl/api/2.0/instance-pools/list" \
  -H "Authorization: Bearer $token" \
  -H "X-Databricks-Azure-SP-Management-Token:$azToken" \
  -H "X-Databricks-Azure-Workspace-Resource-Id:$wsId"

# You can also generate a PAT token. Note the quota limit of 600 tokens.
api_response=$(curl -sf "https://$workspaceUrl/api/2.0/token/create" \
  -H "Authorization: Bearer $token" \
  -H "X-Databricks-Azure-SP-Management-Token:$azToken" \
  -H "X-Databricks-Azure-Workspace-Resource-Id:$wsId" \
  -d '{ "lifetime_seconds": 100, "comment": "this is an example token" }')
export DATABRICKS_TOKEN=$(jq .token_value -r <<< "$api_response")
export DATABRICKS_HOST="https://$workspaceUrl"

Databricks CLI will use the DATABRICKS_TOKEN and DATABRICKS_HOST environment variables as configuration.
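
The same two environment variables can be reused from your own scripts. Here is a minimal sketch, assuming DATABRICKS_HOST holds the full workspace URL including https:// and that a PAT token only needs the Authorization header. The helper name request_from_env and the sample values are hypothetical, and the request object is built without being sent.

```python
import os
import urllib.request


def request_from_env(path, env=None):
    """Build (but do not send) a Databricks REST request from the environment."""
    env = os.environ if env is None else env
    # Assumption: DATABRICKS_HOST is a full URL such as https://<host>
    host = env["DATABRICKS_HOST"].rstrip("/")
    token = env["DATABRICKS_TOKEN"]
    return urllib.request.Request(
        f"{host}/api/2.0/{path.lstrip('/')}",
        headers={"Authorization": f"Bearer {token}"},
    )


# Example with placeholder values instead of the real environment
req = request_from_env(
    "instance-pools/list",
    env={
        "DATABRICKS_HOST": "https://northeurope.azuredatabricks.net/",
        "DATABRICKS_TOKEN": "dapi-example",
    },
)
print(req.full_url)
```

Passing the request to urllib.request.urlopen(req) would then perform the call against a real workspace.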

Option 2 – using cURL

If you do not wish to use the Azure CLI, you can also use REST queries with cURL directly.

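# Change these values.
# Use a Client ID with Contributor permissions
#   on the Databricks workspace.
CLIENT_ID="MY_CLIENT_ID"
CLIENT_SECRET="MY_CLIENT_SECRET"
resourceGroup="MY_RESOURCE_GROUP"
workspaceName="MY_WORKSPACE"
# Databricks workspace host name, without trailing slash
workspaceUrl="northeurope.azuredatabricks.net"

# Looked up with the Azure CLI here for brevity; you can also hard-code both values.
tenantId=$(az account show --query tenantId -o tsv)
wsId=$(az resource show \
  --resource-type Microsoft.Databricks/workspaces \
  -g "$resourceGroup" \
  -n "$workspaceName" \
  --query id -o tsv)

# Request an Azure AD token for the resource passed as $1
getToken () {
  token_response=$(curl -sf -X POST "https://login.microsoftonline.com/$tenantId/oauth2/token" \
    -H 'Content-Type: application/x-www-form-urlencoded' \
    -d "grant_type=client_credentials&client_id=$CLIENT_ID&resource=$1" \
    --data-urlencode "client_secret=$CLIENT_SECRET")
  jq .access_token -r <<< "$token_response"
}

# Get a token for the global Databricks application. This value is fixed and never changes.
token=$(getToken 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d)

# Get a token for the Azure management API
azToken=$(getToken https://management.core.windows.net/)

# Use both tokens in a Databricks API call (here: list instance pools)
curl -sf "https://$workspaceUrl/api/2.0/instance-pools/list" \
  -H "Authorization: Bearer $token" \
  -H "X-Databricks-Azure-SP-Management-Token:$azToken" \
  -H "X-Databricks-Azure-Workspace-Resource-Id:$wsId"

# You can also generate a PAT token. Note the quota limit of 600 tokens.
curl -sf "https://$workspaceUrl/api/2.0/token/create" \
  -H "Authorization: Bearer $token" \
  -H "X-Databricks-Azure-SP-Management-Token:$azToken" \
  -H "X-Databricks-Azure-Workspace-Resource-Id:$wsId" \
  -d '{ "lifetime_seconds": 100, "comment": "this is an example token" }'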
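
One detail that is easy to get wrong with a hand-built form body is URL encoding: a client secret containing characters such as & or + must be escaped. The sketch below shows the same client-credentials request assembled in Python, assuming the Azure AD v1 token endpoint at login.microsoftonline.com. The function name build_token_request and all sample values are hypothetical, and nothing is sent over the network.

```python
from urllib.parse import urlencode, parse_qs

# Resource ID of the global Databricks application (fixed value from this article)
DATABRICKS_RESOURCE = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"


def build_token_request(tenant_id, client_id, client_secret, resource):
    """Return the token endpoint URL and the URL-encoded form body."""
    # Assumption: Azure AD v1 token endpoint
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "resource": resource,
    })
    return url, body


# Placeholder values; the secret deliberately contains reserved characters
url, body = build_token_request(
    "my-tenant", "my-client", "s3cr&t+/=", DATABRICKS_RESOURCE
)
print(url)
print(parse_qs(body)["client_secret"])
```

The body would be sent as a POST with the application/x-www-form-urlencoded content type, and the access_token field of the JSON response is the value used as the bearer token.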

Option 3 – using Python

I have published a Python module to easily interact with Databricks with PAT tokens or AAD:

If you want to implement the logic yourself, the easiest way is to use the azure-common module (which relies on azure-cli-core being installed) to access Azure CLI credentials from Python.

    import requests
    from azure.common.credentials import get_azure_cli_credentials

    # Change these values
    resource_group = "MY_RESOURCE_GROUP"
    databricks_workspace = "MY_WORKSPACE"
    dbricks_location = "northeurope"

    # Reuse the credentials of the current Azure CLI login
    credentials, subscription_id = get_azure_cli_credentials()
    dbricks_api = f"https://{dbricks_location}.azuredatabricks.net/api/2.0"
    # Get a token for the global Databricks application. This value is fixed and never changes.
    adbToken = credentials.get_token("2ff814a6-3304-4ab8-85cb-cd0e6f879c1d").token
    # Get a token for the Azure management API
    azToken = credentials.get_token("https://management.core.windows.net/").token
    dbricks_auth = {
        "Authorization": f"Bearer {adbToken}",
        "X-Databricks-Azure-SP-Management-Token": azToken,
        "X-Databricks-Azure-Workspace-Resource-Id": (
            f"/subscriptions/{subscription_id}"
            f"/resourceGroups/{resource_group}"
            f"/providers/Microsoft.Databricks"
            f"/workspaces/{databricks_workspace}"
        ),
    }
    # List instance pools to check that authentication works
    requests.get(f"{dbricks_api}/instance-pools/list", headers=dbricks_auth).json()
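
Generating a PAT token from Python follows the same pattern as the shell examples. The following is a rough sketch under the assumption that the token/create endpoint accepts the same JSON body and returns a token_value field, as in the curl calls earlier. The helpers build_pat_request and extract_pat are hypothetical names, and the example only assembles the request.

```python
import json


def build_pat_request(api_base, auth_headers, lifetime_seconds=100,
                      comment="this is an example token"):
    """Return keyword arguments suitable for requests.post(**kwargs)."""
    return {
        "url": f"{api_base.rstrip('/')}/token/create",
        "headers": dict(auth_headers),
        "data": json.dumps(
            {"lifetime_seconds": lifetime_seconds, "comment": comment}
        ),
    }


def extract_pat(api_response):
    """Pull the PAT value out of the parsed JSON response."""
    return api_response["token_value"]


# Placeholder base URL and headers; in practice pass dbricks_api and dbricks_auth
pat_request = build_pat_request(
    "https://northeurope.azuredatabricks.net/api/2.0",
    {"Authorization": "Bearer AAD_TOKEN"},
)
print(pat_request["url"])
```

With real values, extract_pat(requests.post(**build_pat_request(dbricks_api, dbricks_auth)).json()) would return the new token, which counts toward the 600-token quota.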

See Part 2, Provisioning Azure Databricks and PAT tokens with Terraform, for a Terraform template which fully automates the provisioning process.

Software Engineer at Microsoft, Data & AI, open source fan