Mittwoch, 23. März 2016

Logging Requests to Elasticsearch

This is something I wanted to write down for years but never got down to completing the post. It can help you a lot with certain Elasticsearch setups by answering two questions using the slow log.

  • Is my application talking to Elasticsearch?
  • What kind of queries are being built by my application?

A while ago I helped a colleague on one of my current projects to debug some problems with Elasticsearch integrated into proprietary software. He was not sure if there are any requests arriving at Elasticsearch and what those look like. We activated the slow log for Elasticsearch, which not only can be used to log the slow queries but also to enable debugging for any queries that reach Elasticsearch.

The slow log, as the name suggests, is there to log slow requests. As slow is a subjective term you can define thresholds that need to be passed. For example you can define that any queries slower than 50ms are logged in the debug level but any queries that take longer than 500ms in the warn level.

Slow queries can be configured for both phases of the query execution: query and fetch. In the query phase only the ids of the documents are retrieved in the form of a search result list. The fetch phase is where the result documents are retrieved.

Besides the slow query log there is also the slow index log which can be used in the same way but measures the time for indexing.

Both of these settings are index settings. That means they are configured for each index and can therefore be different across indices.

Instance Settings

There are multiple places where you can configure index settings. The first is config/elasticsearch.yml that contains the configuration of the instance. For older versions of Elasticsearch it already contains the lines that are commented out, in newer versions you need to include them yourself. If you want to log all requests at debug level you can just add the following lines and set a threshold of 0s.

index.search.slowlog.threshold.query.debug: 0s
index.search.slowlog.threshold.fetch.debug: 0s
index.indexing.slowlog.threshold.index.debug: 0s

You need to reboot the instance so that the settings are activated. Any indexing and search requests will now be logged to separate log file in the log folder. With the default configuration the logs will be at logs/elasticsearch_index_indexing_slowlog.log and logs/elasticsearch_index_search_slowlog.log. The query log will now contain entries like this:

[2016-03-23 06:43:47,231][DEBUG][index.search.slowlog.fetch] took[5.8ms], took_millis[5], types[talk], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"query":{"match":{"tags":"Java"}}}], extra_source[]

If you are testing this with multiple shards on one instance you might get more log lines than expected: There will be one line for every shard in the query phase and one line for the fetch phase.

Runtime Settings

Besides the setting in elasticsearch.yml the slow request logs can also be activated using the HTTP API which doesn't require a reboot of the instance and is therefore really well suited for debugging production issues. The following request changes the setting for the query log for an index conference.

curl -XPUT "http://localhost:9200/conference/_settings" -d'
{
    "index.search.slowlog.threshold.query.debug": "0s"
}'

When you are done debugging your issue you can just set a higher threshold again.