When we launched Solace PubSub+ Cloud, we started collecting all kinds of metrics so we could understand how people were using it and be ready when it came time to start billing based on usage.
We chose Elasticsearch for storage because we trusted its powerful search capabilities and scalability. At the time, we seriously undervalued its Java API. Although it is generally just a facade for Elasticsearch’s REST API, one particularly clever feature has let us quickly build and modify our metrics microservice without compromising robustness.
When we first started using Elasticsearch, we built queries in a pretty straightforward way:
QueryBuilders.boolQuery()
    .must(QueryBuilders.termQuery("metricName", "Host"))
    .must(QueryBuilders.termQuery("metricType", "DiskSpace"))
    .must(QueryBuilders.termQuery("organizationId", organizationId))
    .must(QueryBuilders.rangeQuery("startTime").gte(startTime))
    .must(QueryBuilders.rangeQuery("endTime").lte(endTime));
We eventually realized that almost every query needed an organization’s ID and a time range, so we abstracted those clauses out and required callers to supply only the metric-specific part of the query.
This worked at first, but we didn’t want to have to edit code every time we needed to calculate a new metric or tweak an existing one. This is when we discovered Elasticsearch’s Wrapper Query.
On the surface, this is simple functionality that lets you define a query as a JSON string, like this:
{
  "bool" : {
    "must" : [
      { "terms" : { "metricName" : ["Host"] } },
      { "terms" : { "metricType" : ["DiskSpace"] } }
    ]
  }
}
You then turn that string into a QueryBuilder like this:
QueryBuilders.wrapperQuery(json);
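Since the goal is to keep queries out of the code, the JSON string can come from a file read at runtime. Here is a minimal sketch using only the JDK (Java 11+), assuming one file per metric in a single directory. The class name `MetricQueryFiles`, the `<metricKey>.json` naming scheme, and the key format are hypothetical choices made for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: loads the metric-specific query JSON from disk. */
final class MetricQueryFiles {
    private final Path directory;

    MetricQueryFiles(Path directory) {
        this.directory = directory;
    }

    /** Reads <directory>/<metricKey>.json and returns its contents as a string. */
    String load(String metricKey) throws IOException {
        // Assumed key format: reject anything that could escape the query directory.
        if (!metricKey.matches("[A-Za-z0-9_-]+")) {
            throw new IllegalArgumentException("Invalid metric key: " + metricKey);
        }
        Path file = directory.resolve(metricKey + ".json");
        return Files.readString(file, StandardCharsets.UTF_8);
    }
}
```

The string returned by `load` is what would be handed to `QueryBuilders.wrapperQuery(json)`.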
The next question is how to augment the query so that it is restricted to a given organization ID and time period. One obvious approach would be to add a token somewhere in the JSON string to be replaced, but this is where the Elasticsearch API shines.
You may have noticed that the Wrapper Query is just another QueryBuilder, which means it can be used as a clause inside other builders like any query you construct in code. This lets us reuse most of our abstractions for adding the organization ID and time period to our metric queries:
QueryBuilders.boolQuery()
    .must(QueryBuilders.wrapperQuery(json))
    .must(QueryBuilders.termQuery("organizationId", organizationId))
    .must(QueryBuilders.rangeQuery("startTime").gte(startTime))
    .must(QueryBuilders.rangeQuery("endTime").lte(endTime));
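It can help to see roughly what this combined query looks like on the wire. In the REST query DSL, a wrapper query is a `wrapper` clause whose `query` field carries the wrapped JSON as a Base64-encoded string. The sketch below reproduces that shape with plain JDK string handling instead of the Elasticsearch client. The class and method names are made up for illustration, the timestamps are assumed to be epoch milliseconds, and the real builder's output will differ in details such as field order and default parameters.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Illustrative only: approximates the JSON shape of the combined query. */
final class WrapperQueryShape {

    /** Renders the metric-specific JSON as a "wrapper" clause (Base64-encoded). */
    static String wrapperClause(String metricQueryJson) {
        String encoded = Base64.getEncoder()
                .encodeToString(metricQueryJson.getBytes(StandardCharsets.UTF_8));
        return "{\"wrapper\":{\"query\":\"" + encoded + "\"}}";
    }

    /** Combines the wrapped query with the organization and time-range clauses. */
    static String scopedQuery(String metricQueryJson, String organizationId,
                              long startTime, long endTime) {
        return "{\"bool\":{\"must\":["
                + wrapperClause(metricQueryJson) + ","
                + "{\"term\":{\"organizationId\":" + quote(organizationId) + "}},"
                + "{\"range\":{\"startTime\":{\"gte\":" + startTime + "}}},"
                + "{\"range\":{\"endTime\":{\"lte\":" + endTime + "}}}"
                + "]}}";
    }

    /** Minimal JSON string quoting: escapes quotes, backslashes, and control characters. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }
}
```

Because the metric-specific part travels as an opaque encoded string, no placeholder tokens or string substitution are needed inside the JSON itself.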
And with this, we had our solution. We were able to easily create and update Elasticsearch queries without any code changes while reusing our well-tested abstractions for specifying known search parameters. We were also able to directly use our JSON files as queries to Elasticsearch’s REST API for easier testing and validation.
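To try a query file directly against the REST API, the metric-specific JSON only needs to sit under the top-level `query` key of a `_search` request body. Here is a rough sketch with the JDK's `java.net.http` classes (Java 11+). The class name `SearchRequests`, the host `http://localhost:9200`, and the index name `metrics` used in the usage note are placeholders, not details of our deployment.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Illustrative only: builds a _search request around a query JSON string. */
final class SearchRequests {

    /** Places the query JSON under the top-level "query" key. */
    static String searchBody(String queryJson) {
        return "{\"query\":" + queryJson + "}";
    }

    /** Builds (but does not send) a POST to <baseUrl>/<index>/_search. */
    static HttpRequest searchRequest(String baseUrl, String index, String queryJson) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(searchBody(queryJson)))
                .build();
    }
}
```

A request built with `SearchRequests.searchRequest("http://localhost:9200", "metrics", json)` can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.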
What do you think of this solution? Are there any other Elasticsearch API features we should have used? We are still learning and love hearing about new features and use cases.