by Dyana Rose on 02 Jul 2018

Moonfire is an open source Java library for crafting Druid.io queries.

It provides strongly typed query models and result models, enabling you to build up a query and serialise it to the required JSON and to then deserialise Druid’s response into an easily usable POJO.

Without Moonfire, dynamically crafting a Druid query or using response data would mean either hand-rolling a fairly complicated JSON structure, or rolling your own serialisable classes.

How to build a query with Moonfire

In the Druid documentation several examples of queries are given, for example the Group By Query.

For this query, the JSON that you post to the broker node is

  "queryType": "groupBy",
  "dataSource": "sample_datasource",
  "granularity": "day",
  "dimensions": ["country", "device"],
  "limitSpec": { "type": "default", "limit": 5000, "columns": ["country", "data_transfer"] },
  "filter": {
    "type": "and",
    "fields": [
      { "type": "selector", "dimension": "carrier", "value": "AT&T" },
      { "type": "or",
        "fields": [
          { "type": "selector", "dimension": "make", "value": "Apple" },
          { "type": "selector", "dimension": "make", "value": "Samsung" }
  "aggregations": [
    { "type": "longSum", "name": "total_usage", "fieldName": "user_count" },
    { "type": "doubleSum", "name": "data_transfer", "fieldName": "data_transfer" }
  "postAggregations": [
    { "type": "arithmetic",
      "name": "avg_usage",
      "fn": "/",
      "fields": [
        { "type": "fieldAccess", "fieldName": "data_transfer" },
        { "type": "fieldAccess", "fieldName": "total_usage" }
  "intervals": [ "2012-01-01T00:00:00.000/2012-01-03T00:00:00.000" ],
  "having": {
    "type": "greaterThan",
    "aggregation": "total_usage",
    "value": 100

This query will:

  • use the sample_datasource datasource (dataSource)
  • group by country and device (dimensions)
  • order by country and the data_transfer aggregation result with a limit of top 5000 (limitSpec)
  • filter for rows where the carrier is AT&T and the make is either Apple or Samsung (filter)
  • sum the user_count input column into the total_usage output column (aggregations)
  • sum the data_transfer column into the data_transfer output column (aggregations)
  • calculate the average usage (postAggregations)
  • only return results where the total_usage column is greater than 100

This is a rather simple query to describe, but the JSON that specifies it is somewhat complicated.

With Moonfire, you can create this same query using strongly typed classes:

List<String> intervals = Collections.singletonList("2012-01-01T00:00:00.000/2012-01-03T00:00:00.000");
Granularity granularity = Granularity.day;
List<DimensionSpec> dimensionSpecs = new ArrayList<DimensionSpec>() ;
LimitSpec limitSpec = new DefaultLimitSpec().setLimit(500).setColumns(new ArrayList<OrderByColumnSpec>() );
List<Aggregation> aggregations = new ArrayList<Aggregation>() ;
List<PostAggregation> fields = new ArrayList<PostAggregation>() ;
List<PostAggregation> postAggregations = new ArrayList<PostAggregation>() ;
Filter filter = new AndFilter()
        .addField(new SelectorFilter().setDimension("carrier").setValue("AT&T"))
        .addField(new OrFilter()
                .addField(new SelectorFilter().setDimension("make").setValue("Apple"))
                .addField(new SelectorFilter().setDimension("make").setValue("Samsung"))
HavingSpec havingSpec = new GreaterThanHavingSpec().setAggregation("total_usage").setValue(100);

GroupByQuery query = new GroupByQuery("sample_datasource", intervals, granularity, dimensionSpecs)

Once you have the query object, you will be able to serialise it to valid JSON using the JSON serialiser of your choice.

Examples of all available query types are in the README.md and are also reflected in the test suite.





compile group: 'com.salecycle', name: 'moonfire', version: '0.2.3'


Found a bug, want to contribute, or just want to have a look over the source code? The project is available on GitHub at https://github.com/SaleCycle/druid-moonfire