Druid SQL API

Apache Druid supports two query languages: Druid SQL and native queries. This document describes the API for submitting and canceling Druid SQL queries.

The Druid SQL API is available at https://ROUTER:8888/druid/v2/sql, where ROUTER is the IP address of the Druid Router.

Submit a query

To use the SQL API to make Druid SQL queries, send your query to the Router using the POST method:

POST https://ROUTER:8888/druid/v2/sql/

Submit your query as the value of a "query" field in the JSON object within the request payload. For example:

{"query" : "SELECT COUNT(*) FROM data_source WHERE foo = 'bar'"}

Request body

Property       | Description | Default
---------------|-------------|--------
query          | SQL query string. | none (required)
resultFormat   | Format of query results. See Responses for details. | "object"
header         | Whether or not to include a header row for the query result. See Responses for details. | false
typesHeader    | Whether or not to include type information in the header. Can only be set when header is also true. See Responses for details. | false
sqlTypesHeader | Whether or not to include SQL type information in the header. Can only be set when header is also true. See Responses for details. | false
context        | JSON object containing SQL query context parameters. | {} (empty)
parameters     | List of query parameters for parameterized queries. Each parameter in the list should be a JSON object like {"type": "VARCHAR", "value": "foo"}. The type should be a SQL type; see Data types for a list of supported SQL types. | [] (empty)

You can use curl to send SQL queries from the command line:

$ cat query.json
{"query":"SELECT COUNT(*) AS TheCount FROM data_source"}

$ curl -XPOST -H'Content-Type: application/json' http://ROUTER:8888/druid/v2/sql/ -d @query.json
[{"TheCount":24433}]
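The same request can be sent from any HTTP client. The following is a minimal sketch using Python's requests package; the ROUTER address and the data_source datasource are placeholders carried over from the examples above.

import requests

# Placeholder Router address, as in the curl example above.
DRUID_SQL_URL = "http://ROUTER:8888/druid/v2/sql/"

response = requests.post(
    DRUID_SQL_URL,
    json={"query": "SELECT COUNT(*) AS TheCount FROM data_source"},
)
response.raise_for_status()

# With the default "object" resultFormat, the body is a JSON array of
# JSON objects, one per result row.
for row in response.json():
    print(row["TheCount"])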

There are a variety of SQL query context parameters you can provide by adding a "context" map, like:

{
  "query" : "SELECT COUNT(*) FROM data_source WHERE foo = 'bar' AND __time > TIMESTAMP '2000-01-01 00:00:00'",
  "context" : {
    "sqlTimeZone" : "America/Los_Angeles"
  }
}

Parameterized SQL queries are also supported:

{
  "query" : "SELECT COUNT(*) FROM data_source WHERE foo = ? AND __time > ?",
  "parameters": [
    { "type": "VARCHAR", "value": "bar"},
    { "type": "TIMESTAMP", "value": "2000-01-01 00:00:00" }
  ]
}
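As a sketch, here is the same parameterized query submitted from Python, reusing the placeholder Router address from above. The ? placeholders in the query are bound in order from the parameters list.

import requests

payload = {
    "query": "SELECT COUNT(*) FROM data_source WHERE foo = ? AND __time > ?",
    "parameters": [
        {"type": "VARCHAR", "value": "bar"},
        {"type": "TIMESTAMP", "value": "2000-01-01 00:00:00"},
    ],
}

response = requests.post("http://ROUTER:8888/druid/v2/sql/", json=payload)
response.raise_for_status()
print(response.json())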

Metadata is available over HTTP POST by querying metadata tables.

Responses

Result formats

Druid SQL's HTTP POST API supports a variety of result formats. You can specify these by adding a "resultFormat" parameter, like:

{
  "query" : "SELECT COUNT(*) FROM data_source WHERE foo = 'bar' AND __time > TIMESTAMP '2000-01-01 00:00:00'",
  "resultFormat" : "array"
}

To request a header with information about column names, set header to true in your request. When you set header to true, you can optionally include typesHeader and sqlTypesHeader as well, which give you information about Druid runtime types and SQL types, respectively. You can request all these headers with a request like:

{
  "query" : "SELECT COUNT(*) FROM data_source WHERE foo = 'bar' AND __time > TIMESTAMP '2000-01-01 00:00:00'",
  "resultFormat" : "array",
  "header" : true,
  "typesHeader" : true,
  "sqlTypesHeader" : true
}

The following table shows supported result formats:

Format      | Description | Header description | Content-Type
------------|-------------|--------------------|-------------
object      | The default. A JSON array of JSON objects. Each object's field names match the columns returned by the SQL query, and are provided in the same order as the SQL query. | If header is true, the first row is an object where the fields are column names. Each field's value is either null (if typesHeader and sqlTypesHeader are false) or an object that contains the Druid type as type (if typesHeader is true) and the SQL type as sqlType (if sqlTypesHeader is true). | application/json
array       | JSON array of JSON arrays. Each inner array has elements matching the columns returned by the SQL query, in order. | If header is true, the first row is an array of column names. If typesHeader is true, the next row is an array of Druid types. If sqlTypesHeader is true, the next row is an array of SQL types. | application/json
objectLines | Like object, but the JSON objects are separated by newlines instead of being wrapped in a JSON array. This can make it easier to parse the entire response set as a stream, if you do not have ready access to a streaming JSON parser. To make it possible to detect a truncated response, this format includes a trailer of one blank line. | Same as object. | text/plain
arrayLines  | Like array, but the JSON arrays are separated by newlines instead of being wrapped in a JSON array. This can make it easier to parse the entire response set as a stream, if you do not have ready access to a streaming JSON parser. To make it possible to detect a truncated response, this format includes a trailer of one blank line. | Same as array, except the rows are separated by newlines. | text/plain
csv         | Comma-separated values, with one row per line. Individual field values may be escaped by being surrounded in double quotes. If double quotes appear in a field value, they will be escaped by replacing them with double-double-quotes like ""this"". To make it possible to detect a truncated response, this format includes a trailer of one blank line. | Same as array, except the lists are in CSV format. | text/csv

If typesHeader is set to true, Druid type information is included in the response. Complex types, like sketches, will be reported as COMPLEX<typeName> if a particular complex type name is known for that field, or as COMPLEX if the particular type name is unknown or mixed. If sqlTypesHeader is set to true, SQL type information is included in the response. It is possible to set both typesHeader and sqlTypesHeader at once. Both parameters require that header is also set.

To aid in building clients that are compatible with older Druid versions, Druid returns the HTTP header X-Druid-SQL-Header-Included: yes if header was set to true and if the version of Druid the client is connected to understands the typesHeader and sqlTypesHeader parameters. This HTTP response header is present irrespective of whether typesHeader or sqlTypesHeader are set.

Druid returns the SQL query identifier in the X-Druid-SQL-Query-Id HTTP header. If you specified sqlQueryId in the query context parameters, the header carries that value; otherwise, Druid generates a SQL query id for you.
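The following sketch ties these pieces together: it requests the array format with all three headers enabled, reads the HTTP response headers described above, and splits the header rows from the data rows. The Router address and datasource remain placeholders.

import requests

payload = {
    "query": "SELECT COUNT(*) AS TheCount FROM data_source",
    "resultFormat": "array",
    "header": True,
    "typesHeader": True,
    "sqlTypesHeader": True,
}

response = requests.post("http://ROUTER:8888/druid/v2/sql/", json=payload)
response.raise_for_status()

# Present whenever header is true and the server understands the
# typesHeader and sqlTypesHeader parameters.
assert response.headers.get("X-Druid-SQL-Header-Included") == "yes"

# The SQL query id Druid assigned, or your sqlQueryId from the context.
print(response.headers["X-Druid-SQL-Query-Id"])

rows = response.json()
# For the array format with all three headers enabled, the first three
# rows are column names, Druid types, and SQL types, in that order.
column_names, druid_types, sql_types = rows[0], rows[1], rows[2]
data_rows = rows[3:]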

Errors

Errors that occur before the response body is sent will be reported in JSON, with an HTTP 500 status code, in the same format as native Druid query errors. If an error occurs while the response body is being sent, at that point it is too late to change the HTTP status code or report a JSON error, so the response will simply end midstream and an error will be logged by the Druid server that was handling your request.

As a caller, it is important that you properly handle response truncation. This is easy for the object and array formats, since truncated responses will be invalid JSON. For the line-oriented formats, you should check the trailer they all include: one blank line at the end of the result set. If you detect a truncated response, either through a JSON parsing error or through a missing trailing newline, you should assume the response was not fully delivered due to an error.
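For example, here is a sketch of consuming the objectLines format as a stream and treating a missing blank-line trailer as truncation, again with a placeholder Router address and datasource.

import json
import requests

payload = {
    "query": "SELECT COUNT(*) AS TheCount FROM data_source",
    "resultFormat": "objectLines",
}

rows = []
complete = False
with requests.post("http://ROUTER:8888/druid/v2/sql/",
                   json=payload, stream=True) as response:
    response.raise_for_status()
    for line in response.iter_lines(decode_unicode=True):
        if line == "":
            # The blank-line trailer: the response is complete.
            complete = True
            break
        rows.append(json.loads(line))

if not complete:
    raise RuntimeError("truncated response: blank-line trailer missing")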

Cancel a query

You can use the HTTP DELETE method to cancel a SQL query on either the Router or the Broker. When you cancel a query, Druid handles the cancellation in a best-effort manner. It marks the query canceled immediately and aborts the query execution as soon as possible. However, your query may run for a short time after your cancellation request.

Druid SQL's HTTP DELETE method uses the following syntax:

DELETE https://ROUTER:8888/druid/v2/sql/{sqlQueryId}

The DELETE method requires the sqlQueryId path parameter. To know the query id in advance, set sqlQueryId yourself in the query context. Druid does not enforce uniqueness of sqlQueryId values, so if you issue a cancel request for a sqlQueryId that is active in more than one query context, Druid cancels all queries that use that query id.

For example, if you issue the following query:

curl --request POST 'https://ROUTER:8888/druid/v2/sql' \
--header 'Content-Type: application/json' \
--data-raw '{"query" : "SELECT sleep(CASE WHEN sum_added > 0 THEN 1 ELSE 0 END) FROM wikiticker WHERE sum_added > 0 LIMIT 15",
"context" : {"sqlQueryId" : "myQuery01"}}'

You can cancel the query using the query id myQuery01 as follows:

curl --request DELETE 'https://ROUTER:8888/druid/v2/sql/myQuery01'

Cancellation requests require READ permission on all resources used in the SQL query.

Druid returns an HTTP 202 response for successful deletion requests.

Druid returns an HTTP 404 response in the following cases:

  • sqlQueryId is incorrect.
  • The query completes before your cancellation request is processed.

Druid returns an HTTP 403 response for authorization failure.
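Putting the pieces together, the following sketch sets sqlQueryId in the context at submission time and then issues the cancellation, handling the status codes listed above. The Router address and query id are the placeholders used earlier; in practice the submission would run in another thread or process, since the POST blocks until the query finishes.

import requests

BASE = "http://ROUTER:8888/druid/v2/sql"

# Submit with a predictable query id (normally from a separate thread):
# requests.post(BASE, json={
#     "query": "SELECT COUNT(*) FROM data_source",
#     "context": {"sqlQueryId": "myQuery01"},
# })

cancel = requests.delete(BASE + "/myQuery01")
if cancel.status_code == 202:
    print("cancellation accepted")
elif cancel.status_code == 404:
    print("unknown query id, or the query already completed")
elif cancel.status_code == 403:
    print("not authorized to cancel this query")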

โ† All functionsJDBC driver API โ†’
  • Submit a query
    • Request body
    • Responses
  • Cancel a query

Technologyโ€‚ยทโ€‚Use Casesโ€‚ยทโ€‚Powered by Druidโ€‚ยทโ€‚Docsโ€‚ยทโ€‚Communityโ€‚ยทโ€‚Downloadโ€‚ยทโ€‚FAQ

โ€‚ยทโ€‚โ€‚ยทโ€‚โ€‚ยทโ€‚
Copyright ยฉ 2022 Apache Software Foundation.
Except where otherwise noted, licensed under CC BY-SA 4.0.
Apache Druid, Druid, and the Druid logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.