Writing Aggregationsedit

NEST allows you to write your aggregations using

  • a strict fluent DSL
  • a verbatim object initializer syntax that maps verbatim to the Elasticsearch API
  • a more terse object initializer aggregation DSL

Three different ways, yikes that’s a lot to take in! Lets go over them one by one and explain when you might want to use each.

This is the json output for each example

Example json output. 

{
  "aggs": {
    "name_of_child_agg": {
      "children": {
        "type": "commits"
      },
      "aggs": {
        "average_per_child": {
          "avg": {
            "field": "confidenceFactor"
          }
        },
        "max_per_child": {
          "max": {
            "field": "confidenceFactor"
          }
        },
        "min_per_child": {
          "min": {
            "field": "confidenceFactor"
          }
        }
      }
    }
  }
}

Fluent DSLedit

The fluent lambda syntax is the most terse way to write aggregations. It benefits from types that are carried over to sub aggregations

s => s
.Aggregations(aggs => aggs
    .Children<CommitActivity>("name_of_child_agg", child => child
        .Aggregations(childAggs => childAggs
            .Average("average_per_child", avg => avg.Field(p => p.ConfidenceFactor))
            .Max("max_per_child", avg => avg.Field(p => p.ConfidenceFactor))
            .Min("min_per_child", avg => avg.Field(p => p.ConfidenceFactor))
        )
    )
)

Object Initializer syntaxedit

The object initializer syntax (OIS) is a one-to-one mapping with how aggregations have to be represented in the Elasticsearch API. While it has the benefit of being a one-to-one mapping, being dictionary based in C# means it can grow exponentially in complexity rather quickly.

new SearchRequest<Project>
{
    Aggregations = new ChildrenAggregation("name_of_child_agg", typeof(CommitActivity))
    {
        Aggregations =
            new AverageAggregation("average_per_child", "confidenceFactor")
            && new MaxAggregation("max_per_child", "confidenceFactor")
            && new MinAggregation("min_per_child", "confidenceFactor")
    }
}

Terse Object Initializer DSLedit

For this reason the OIS syntax can be shortened dramatically by using *Agg related family, These allow you to forego introducing intermediary Dictionaries to represent the aggregation DSL. It also allows you to combine multiple aggregations using bitwise AND &&) operator.

Compare the following example with the previous vanilla OIS syntax

new SearchRequest<Project>
{
    Aggregations = new ChildrenAggregation("name_of_child_agg", typeof(CommitActivity))
    {
        Aggregations =
            new AverageAggregation("average_per_child", Field<CommitActivity>(p => p.ConfidenceFactor))
            && new MaxAggregation("max_per_child", Field<CommitActivity>(p => p.ConfidenceFactor))
            && new MinAggregation("min_per_child", Field<CommitActivity>(p => p.ConfidenceFactor))
    }
}

Aggregating over a collection of aggregationsedit

An advanced scenario may involve an existing collection of aggregation functions that should be set as aggregations on the request. Using LINQ’s .Aggregate() method, each function can be applied to the aggregation descriptor childAggs below) in turn, returning the descriptor after each function application.

var aggregations = new List<Func<AggregationContainerDescriptor<CommitActivity>, IAggregationContainer>> 
{
    a => a.Average("average_per_child", avg => avg.Field(p => p.ConfidenceFactor)),
    a => a.Max("max_per_child", avg => avg.Field(p => p.ConfidenceFactor)),
    a => a.Min("min_per_child", avg => avg.Field(p => p.ConfidenceFactor))
};
return s => s
    .Aggregations(aggs => aggs
        .Children<CommitActivity>("name_of_child_agg", child => child
            .Aggregations(childAggs =>
                aggregations.Aggregate(childAggs, (acc, agg) => { agg(acc); return acc; }) 
            )
        )
    );

a list of aggregation functions to apply

Using LINQ’s Aggregate() function to accumulate/apply all of the aggregation functions

Combining multipe AggregationDescriptor's is also possible using the bitwise & operator

Fluent DSL Exampleedit

var aggregations = new AggregationContainerDescriptor<CommitActivity>()
    .Average("average_per_child", avg => avg.Field(p => p.ConfidenceFactor))
    .Max("max_per_child", avg => avg.Field(p => p.ConfidenceFactor))
    && new AggregationContainerDescriptor<CommitActivity>()
        .Min("min_per_child", avg => avg.Field(p => p.ConfidenceFactor));
return s => s
    .Aggregations(aggs => aggs
        .Children<CommitActivity>("name_of_child_agg", child => child
            .Aggregations(childAggs => aggregations)
        )
    );

Aggs vs. Aggregationsedit

The response exposes both .Aggregations and .Aggs properties for handling aggregations. Why two properties you ask? Well, the former is a dictionary of aggregation names to IAggregate types, a common interface for aggregation responses (termed Aggregates in NEST), and the latter is a convenience helper to get the right type of aggregation response out of the dictionary based on a key name.

This is better illustrated with an example

Let’s imagine we make the following request.

s => s
.Aggregations(aggs => aggs
    .Children<CommitActivity>("name_of_child_agg", child => child
        .Aggregations(childAggs => childAggs
            .Average("average_per_child", avg => avg.Field(p => p.ConfidenceFactor))
            .Max("max_per_child", avg => avg.Field(p => p.ConfidenceFactor))
            .Min("min_per_child", avg => avg.Field(p => p.ConfidenceFactor))
        )
    )
)

Aggs usageedit

Now, using .Aggs, we can easily get the Children aggregation response out and from that, the Average and Max sub aggregations.

response.IsValid.Should().BeTrue();

var childAggregation = response.Aggs.Children("name_of_child_agg");

var averagePerChild = childAggregation.Average("average_per_child");

averagePerChild.Should().NotBeNull(); 

var maxPerChild = childAggregation.Max("max_per_child");

maxPerChild.Should().NotBeNull(); 

Do something with the average per child. Here we just assert it’s not null

Do something with the max per child. Here we just assert it’s not null