Donnerstag, 23. Juli 2015

ActiveMQ as a Message Broker for Logstash

When scaling Logstash it is common to add a message broker that is used to temporarily buffer incoming messages before they are being processed by one or more Logstash nodes. Data is pushed to the brokers either through a shipper like Beaver that reads logfiles and sends each event to the broker. Alternatively the application can send the log events directly using something like a Log4j appender.

A common option is to use Redis as a broker that stores the data in memory but using other options like Apache Kafka is also possible. Sometimes organizations are not that keen to introduce lots of new technology and want to reuse existing stores. ActiveMQ is a widely used messaging and integration platform that supports different protocols and looks just perfect for the use as a message broker. Let's see the options to integrate it.

Setting up ActiveMQ

ActiveMQ can easily be set up using the scripts that ship with it. On Linux it's just a matter of executing ./activemq console. Using the admin console at http://127.0.0.1:8161/admin/ you can create new queues and even enqueue messages for testing.

Consuming messages with AMQP

An obvious way to try to connect ActiveMQ to Logstash is using AMQP, the Advanced Message Queuing Protocol. It's a standard protocol that is supported by different messaging platforms.

There used to be a Logstash input for AMQP but unfortunately it has been renamed to rabbitmq-input because RabbitMQ is the main system that is supported.

Let's see what happens if we try to use the input with ActiveMQ.

input {
    rabbitmq {
        host => "localhost"
        queue => "TestQueue"
        port => 5672
    }
}

output {
    stdout {
        codec => "rubydebug"
    }
}

We tell Logstash to listen on localhost on the standard port on a queue named TestQueue. The result should just be dumped to the standard output. Unfortunately Logstash only issues errors because it can't connect.

Logstash startup completed
RabbitMQ connection error: . Will reconnect in 10 seconds... {:level=>:error}

In the ActiveMQ logs we can see that our parameters are correct but unfortunately both systems seem to speak different dialects of AMQP.

 WARN | Connection attempt from non AMQP v1.0 client. AMQP,0,0,9,1
org.apache.activemq.transport.amqp.AmqpProtocolException: Connection from client using unsupported AMQP attempted
...

So bad luck with this option.

Consuming messages with STOMP

The aptly named Simple Text Oriented Messaging Protocol is another option that is supported by ActiveMQ. Fortunately there is a dedicated input for it. It is not included in Logstash by default but can be installed easily.

bin/plugin install logstash-input-stomp

Afterwards we can just use it in our Logstash config.

input {
    stomp {
        host => "localhost"
        destination => "TestQueue"
    }
}

output {
    stdout {
        codec => "rubydebug"
    }
}

This time we are better off: Logstash really can connect and dumps our message to the standard output.

bin/logstash --config stomp.conf 
Logstash startup completed
{
       "message" => "Can I kick it...",
      "@version" => "1",
    "@timestamp" => "2015-07-22T05:42:35.016Z"
}

Consuming messages with JMS

Though the stomp-input works there is even another option that is not released yet but can already be tested: jms-input supports the Java Messaging System, the standard way of doing messaging on the JVM.

Currently you need to build the plugin yourself (which didn't work on my machine but should be caused by my outdated local jruby installation).

Getting data in ActiveMQ

Now that we know of ways to consume data from ActiveMQ it is time to think about how to get data in. When using Java you can use something like a Log4j- or Logback-Appender that push the log events directly to the queue using JMS.

When it comes to shipping data unfortunately none of the more popular solutions seems to be able to push data to ActiveMQ. If you know of any solution that can be used it would be great if you could leave a comment.

All in all I think it can be possible to use ActiveMQ as a broker for Logstash but it might require some more work when it comes to shipping data.

Freitag, 6. Februar 2015

Fixing Elasticsearch Allocation Issues

Last week I was working with some Logstash data on my laptop. There are around 350 indices that contain the logstash data and an index that holds the metadata for Kibana 4. When trying to start the single node cluster I have to wait a while, until all indices are available. Some APIs can be used to see the progress of the startup process.

The cluster health API gives general information about the state of the cluster and indicates if the cluster health is green, yellow or red. After a while the number of unassigned shards didn't change anymore but the cluster still stayed in a red state.

curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
{
  "cluster_name" : "elasticsearch",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 1850,
  "active_shards" : 1850,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 1852
}

One shard couldn't be recovered: 1850 were ok but it should have been 1851. To see the problem we can use the cat indices command that will show us all indices and their health.

curl http://localhost:9200/_cat/indices
[...]
yellow open logstash-2014.02.16 5 1 1184 0   1.5mb   1.5mb 
red    open .kibana             1 1                        
yellow open logstash-2014.06.03 5 1 1857 0     2mb     2mb 
[...]

The .kibana index didn't turn yellow. It only consists of one primary shard that couldn't be allocated.

Restarting the node and closing and opening the index didn't help. Looking at elasticsearch-kopf I could see that primary and replica shards both were unassingned (You need to tick the checkbox that says hide special to see the index).

Fortunately there is a way to bring the cluster in a yellow state again. We can manually allocate the primary shard on our node.

Elasticsearch provides the Cluster Reroute API that can be used to allocate a shard on a node. When trying to allocate the shard of the index .kibana I first got an exception.

curl -XPOST "http://localhost:9200/_cluster/reroute" -d'
{
    "commands" : [ {
          "allocate" : {
              "index" : ".kibana", "shard" : 0, "node" : "Jebediah Guthrie"
          }
        }
    ]
}'

[2015-01-30 13:35:47,848][DEBUG][action.admin.cluster.reroute] [Jebediah Guthrie] failed to perform [cluster_reroute (api)]
org.elasticsearch.ElasticsearchIllegalArgumentException: [allocate] trying to allocate a primary shard [.kibana][0], which is disabled
Fortunately the message already tells us the problem: By default you are not allowed to allocate primary shards due to the danger of losing data. If you'd like to allocate a primary shard you need to tell it Elasticsearch explicitly by setting the property allow_primary.
curl -XPOST "http://localhost:9200/_cluster/reroute" -d'
{
    "commands" : [ {
          "allocate" : {
              "index" : ".kibana", "shard" : 0, "node" : "Jebediah Guthrie", "allow_primary": "true"
          }
        }
    ]
}'

For me this helped and my shard got reallocated and the cluster health turned yellow.

I am not sure what caused the problems but it is very likely related to the way I am working locally. I am regularly sending my laptop to sleep which is something you never do on a server. Nevertheless I have seen this problem a few times locally which justifies writing down the necessary steps to fix it.

Freitag, 23. Januar 2015

Logging to Redis using Spring Boot and Logback

When doing centralized logging, e.g. using Elasticsearch, Logstash and Kibana or Graylog2 you have several options available for your Java application. You can either write your standard application logs and parse those using Logstash, either consumed directly or shipped to another machine using something like logstash-forwarder. Alternatively you can write in a more appropriate format like JSON directly so the processing step doesn't need that much work for parsing your messages. As a third option is to write to a different data store directly which acts as a buffer for your log messages. In this post we are looking at how we can configure Logback in a Spring Boot application to write the log messages to Redis directly.

Redis

We are using Redis as a log buffer for our messages. Not everyone is happy with Redis but it is a common choice. Redis stores its content in memory which makes it well suited for fast access but can also sync it to disc when necessary. A special feature of Redis is that the values can be different data types like strings, lists or sets. Our application uses a single key and value pair where the key is the name of the application and the value is a list that contains all our log messages. This way we can handle several logging applications in one Redis instance.

When testing your setup you might also want to look into the data that is stored in Redis. You can access it using the redis-cli client. I collected some useful commands for validating your log messages are actually written to Redis.

CommandDescription
KEYS *Show all keys in this Redis instance
LLEN keyShow the number of messages in the list for key
LRANGE key 0 100Show the first 100 messages in the list for key

The Logback Config

When working with Logback most of the time an XML file is used for all the configuration. Appenders are the things that send the log output somewhere. Loggers are used to set log levels and attach appenders to certain pieces of the application.

For Spring Boot Logback is available for any application that uses the spring-boot-starter-logging which is also a dependency of the common spring-boot-starter-web. The configuration can be added to a file called logback.xml that resides in src/main/resources.

Spring boot comes with a file and a console appender that are already configured correctly. We can include the base configuration in our file to keep all the predefined configurations.

For logging to Redis we need to add another appender. A good choice is the logback-redis-appender that is rather lightweight and uses the Java client Jedis. The log messages are written to Redis in JSON directly so it's a perfect match for something like logstash. We can make Spring Boot log to a local instance of Redis by using the following configuration.

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <include resource="org/springframework/boot/logging/logback/base.xml"/>
    <appender name="LOGSTASH" class="com.cwbase.logback.RedisAppender">
        <host>localhost</host>
        <port>6379</port>
        <key>my-spring-boot-app</key>
    </appender>
    <root level="INFO">
        <appender-ref ref="LOGSTASH" />
        <appender-ref ref="CONSOLE" />
        <appender-ref ref="FILE" />
    </root>
</configuration>

We configure an appender named LOGSTASH that is an instance of RedisAppender. Host and port are set for a local Redis instance, key identifies the Redis key that is used for our logs. There are more options available like the interval to push log messages to Redis. Explore the readme of the project for more information.

Spring Boot Dependencies

To make the logging work we of course have to add a dependency to the logback-redis-appender to our pom. Depending on your Spring Boot version you might see some errors in your log file that methods are missing.

This is because Spring Boot manages the dependencies it uses internally and the versions for jedis and commons-pool2 do not match the ones that we need. If this happens we can configure the versions to use in the properties section of our pom.

<properties>
    <commons-pool2.version>2.0</commons-pool2.version>
    <jedis.version>2.5.2</jedis.version>
</properties>

Now the application will start and you can see that it sends the log messages to Redis as well.

Enhancing the Configuration

Having the host and port configured in the logback.xml is not the best thing to do. When deploying to another environment with different settings you have to change the file or deploy a custom one.

The Spring Boot integration of Logback allows to set some of the configuration options like the file to log to and the log levels using the main configuration file application.properties. Unfortunately this is a special treatment for some values and you can't add custom values as far as I could see.

But fortunately Logback supports the use of environment variables so we don't have to rely on configuration files. Having set the environment variables REDIS_HOST and REDIS_PORT you can use the following configuration for your appender.

    <appender name="LOGSTASH" class="com.cwbase.logback.RedisAppender">
        <host>${REDIS_HOST}</host>
        <port>${REDIS_PORT}</port>
        <key>my-spring-boot-app</key>
    </appender>

We can even go one step further. To only activate the appender when the property is set you can add conditional processing to your configuration.

    <if condition='isDefined("REDIS_HOST") &amp;&amp; isDefined("REDIS_PORT")'>
        <then>
            <appender name="LOGSTASH" class="com.cwbase.logback.RedisAppender">
                <host>${REDIS_HOST}</host>
                <port>${REDIS_PORT}</port>
                <key>my-spring-boot-app</key>
            </appender>
        </then>
    </if>

You can use a Java expression for deciding if the block should be evaluated. When the appender is not available Logback will just log an error and uses any other appenders that are configured. For this to work you need to add the Janino library to your pom.

Now the appender is activated based on the environment variables. If you like you can skip the setup for local development and only set the variables on production systems.

Conclusion

Getting started with Spring Boot or logging to Redis alone is very easy but some of the details are some work to get right. But it's worth the effort: Once you get used to centralized logging you don't want to have your systems running without it anymore.

Elasticsearch - Der praktische Einstieg
Java Code Geeks