Microsoft Azure
Azure extension
This extension allows you to do the following:
- Ingest data from objects stored in Azure Blob Storage.
- Write segments to Azure Blob Storage for deep storage.
- Persist task logs to Azure Blob Storage for long-term storage.
To use this Apache Druid extension, include druid-azure-extensions in the extensions load list.
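As a minimal sketch, the load list entry might look like the following. It assumes the setting lives in your common runtime properties file; keep any other extensions you already load in the same list.

```properties
# Add the Azure extension to the extensions Druid loads at startup.
# If you already load other extensions, append this entry to the existing list.
druid.extensions.loadList=["druid-azure-extensions"]
```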
Ingest data from Azure
Ingest data using either MSQ or a native batch parallel task with an Azure input source (azureStorage) to read objects directly from Azure Blob Storage.
Store segments in Azure
To use Azure for deep storage, set druid.storage.type=azure.
Configure location
Configure where to store segments using the following properties:
Property | Description | Default |
---|---|---|
druid.azure.account | The Azure Storage account name. | Must be set. |
druid.azure.container | The Azure Storage container name. | Must be set. |
druid.azure.prefix | A prefix string that will be prepended to the blob names for the segments published. | "" |
druid.azure.maxTries | Number of tries before canceling an Azure operation. | 3 |
druid.azure.protocol | The protocol to use to connect to the Azure Storage account. Either http or https. | https |
druid.azure.storageAccountEndpointSuffix | The Storage account endpoint to use. Override the default value to connect to Azure Government or storage accounts with Azure DNS zone endpoints. Do not include the storage account name prefix in this config value. Examples: ABCD1234.blob.storage.azure.net, blob.core.usgovcloudapi.net. | blob.core.windows.net |
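The location properties above could be combined as in the following sketch. The account name, container name, and prefix are hypothetical placeholders, not values from this page; the remaining settings repeat the documented defaults and can be omitted.

```properties
# Use Azure Blob Storage for deep storage.
druid.storage.type=azure

# Required. Hypothetical names; replace with your own account and container.
druid.azure.account=mystorageaccount
druid.azure.container=druid-segments

# Optional. Hypothetical prefix prepended to segment blob names.
druid.azure.prefix=segments

# Optional. These values match the documented defaults.
druid.azure.maxTries=3
druid.azure.protocol=https
druid.azure.storageAccountEndpointSuffix=blob.core.windows.net
```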
Configure authentication
Authenticate access to Azure Blob Storage using one of the following methods:
- SAS token
- Shared Key
- Default Azure credentials chain (DefaultAzureCredential).
Configure authentication using the following properties:
Property | Description | Default |
---|---|---|
druid.azure.sharedAccessStorageToken | The SAS (shared access signature) token. | |
druid.azure.key | The Shared Key. | |
druid.azure.useAzureCredentialsChain | If true, use DefaultAzureCredential for authentication. | false |
druid.azure.managedIdentityClientId | To use managed identity authentication in the DefaultAzureCredential, set druid.azure.useAzureCredentialsChain to true and provide the client ID here. | |
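The three authentication methods might be configured as sketched below. All values in angle brackets are placeholders, and the sketch assumes you enable only one method at a time, in line with the "one of the following methods" wording above.

```properties
# Option 1: Shared Key (placeholder value).
druid.azure.key=<your-shared-key>

# Option 2: SAS token (placeholder value). Uncomment to use instead of Option 1.
# druid.azure.sharedAccessStorageToken=<your-sas-token>

# Option 3: DefaultAzureCredential. Uncomment to use instead of Option 1.
# The client ID line is only needed for managed identity authentication.
# druid.azure.useAzureCredentialsChain=true
# druid.azure.managedIdentityClientId=<your-managed-identity-client-id>
```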
Persist task logs in Azure
To persist task logs in Azure Blob Storage, set druid.indexer.logs.type=azure.
Druid stores task logs using the storage account and authentication method configured for storing segments. Use the following configuration to set up where to store the task logs:
Property | Description | Default |
---|---|---|
druid.indexer.logs.container | The Azure Blob Storage container to write logs to. | Must be set. |
druid.indexer.logs.prefix | The path to prepend to logs. | Must be set. |
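A task log configuration might look like the following sketch. The container name and prefix are hypothetical; the storage account and credentials come from the druid.azure.* properties already set for segment storage.

```properties
# Persist task logs to Azure Blob Storage.
druid.indexer.logs.type=azure

# Required. Hypothetical container and prefix; replace with your own.
druid.indexer.logs.container=druid-task-logs
druid.indexer.logs.prefix=task-logs
```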
For general options regarding task retention, see Log retention policy.