Apache To Riak

This article explains how to use Fluentd's Riak Output plugin (out_riak) to aggregate semi-structured logs in real-time.


Prerequisites

  1. An OSX or Linux machine
  2. Fluentd is installed (installation guide)
  3. Riak is installed
  4. An Apache web server log

Installing the Fluentd Riak Output Plugin

The Riak output plugin is used to output data from a Fluentd node to a Riak node.

Rubygems Users

Rubygems users can run the command below to install the plugin:
$ gem install fluent-plugin-riak

td-agent Users

If you are using td-agent, run the following command to install the Riak output plugin:
  • td-agent v2: /usr/sbin/td-agent-gem install fluent-plugin-riak
  • td-agent v1: `/usr/lib/fluent/ruby/bin/fluent-gem install

Configuring Fluentd

Create a configuration file called fluent.conf and add the following lines:
<source>
  @type tail
  format apache2
  path /var/log/apache2/access_log
  pos_file /var/log/fluentd/apache2.access_log.pos
  tag riak.apache
</source>

<match riak.**>
  @type riak
  buffer_type memory
  flush_interval 5s
  retry_limit 5
  retry_wait 1s
  nodes localhost:8087 # Assumes Riak is running locally on port 8087
</match>
The <source>...</source> section tells Fluentd to tail an Apache2-formatted log file located at /var/log/apache2/access_log. Each line is parsed as an Apache access log event and tagged with the riak.apache label.
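To make the parsing step concrete, here is a minimal Python sketch of how one access log line could be turned into the fields that end up in each event. The regular expression and the name parse_apache_line are assumptions made for illustration; they approximate what an apache2-style parser extracts and are not Fluentd's own implementation. The time field is kept as the raw log string rather than converted.

```python
import re

# Illustrative pattern for Apache common/combined log lines.
# It approximates an apache2-style parser; it is NOT Fluentd's own regex.
APACHE_LINE = re.compile(
    r'^(?P<host>\S*) \S* (?P<user>\S*) \[(?P<time>[^\]]*)\] '
    r'"(?P<method>\S+) (?P<path>\S+)(?: \S+)?" '
    r'(?P<code>\S*) (?P<size>\S*)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?$'
)


def parse_apache_line(line):
    """Return a dict of fields for one access log line, or None if it does not match."""
    m = APACHE_LINE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    # Apache writes "-" for "no value"; represent that (and absent fields) as None.
    for key in ("user", "referer", "agent"):
        if rec[key] in (None, "-"):
            rec[key] = None
    # Status code and response size become integers when they are numeric.
    rec["code"] = int(rec["code"]) if rec["code"].isdigit() else None
    rec["size"] = int(rec["size"]) if rec["size"].isdigit() else None
    return rec
```

Lines that do not look like access log entries return None, which mirrors the idea that only successfully parsed lines become events.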
The <match riak.**>...</match> section tells Fluentd to look for events whose tags start with riak. and send all matches to a Riak node located at localhost:8087. You can send events to multiple nodes by writing nodes host1 host2 host3 instead.
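The tag matching described above can be sketched in a few lines of Python. The sketch assumes Fluentd's documented pattern rules, where * matches exactly one tag part and ** matches zero or more tag parts; the helper name tag_matches is hypothetical and is not part of Fluentd.

```python
def tag_matches(pattern, tag):
    """Return True if `tag` matches a Fluentd-style match `pattern` (sketch only)."""
    return _match(pattern.split("."), tag.split("."))


def _match(pat, parts):
    # Both exhausted means a full match; a leftover tag part means no match.
    if not pat:
        return not parts
    head, rest = pat[0], pat[1:]
    if head == "**":
        # "**" may consume zero or more tag parts, so try every split point.
        return any(_match(rest, parts[i:]) for i in range(len(parts) + 1))
    if not parts:
        return False
    if head == "*" or head == parts[0]:
        # "*" consumes exactly one part; a literal must equal the part.
        return _match(rest, parts[1:])
    return False
```

Under these assumptions, tag_matches("riak.**", "riak.apache") returns True, while a tag such as other.apache is not routed to the Riak output.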


Launch Fluentd with the following command:
$ fluentd -c fluent.conf
Please confirm that you have the file access permissions to (1) read the Apache log file and (2) write to `/var/log/fluentd/apache2.access_log.pos` (sudo-ing might help).
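The two permission checks can be scripted before launching Fluentd. The following is a rough sketch, not an official tool: the function name check_access is made up for this example, and it only reports what the user running the script can do, so run it as the same user that will run Fluentd.

```python
import os


def check_access(log_path, pos_path):
    """Report whether the current user can read the log and write the pos file."""
    pos_dir = os.path.dirname(pos_path) or "."
    if os.path.exists(pos_path):
        # An existing pos file must itself be writable.
        pos_writable = os.access(pos_path, os.W_OK)
    else:
        # The pos file does not exist yet, so its directory must allow creating it.
        pos_writable = os.path.isdir(pos_dir) and os.access(pos_dir, os.W_OK)
    return {
        "log_readable": os.access(log_path, os.R_OK),
        "pos_writable": pos_writable,
    }
```

For the configuration in this article, the call would be check_access("/var/log/apache2/access_log", "/var/log/fluentd/apache2.access_log.pos"). If either value is False, fix the file permissions or run Fluentd as a user that has access.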
You should now see data coming into your Riak cluster. We can make sure that everything is running smoothly by hitting Riak's HTTP API:
$ curl 'http://localhost:8098/buckets/fluentlog/keys?keys=true'
{"keys":["2014-01-23-d30b0698-b9de-4290-b8be-a66555497078", ...]}
$ curl http://localhost:8098/buckets/fluentlog/keys/2014-01-23-d30b0698-b9de-4290-b8be-a66555497078
{
  "tag": "riak.apache",
  "time": "2004-03-08T01:23:54Z",
  "host": "",
  "user": null,
  "method": "GET",
  "path": "/twiki/bin/statistics/Main",
  "code": 200,
  "size": 808,
  "referer": null,
  "agent": null
}
There it is! (The response JSON is formatted for readability.)
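The same check can be done from a script. The sketch below uses only the two HTTP endpoints shown in the curl commands above and Python's standard library; the function names, the default base URL, and the limit parameter are assumptions chosen for illustration. It also assumes each stored object is a JSON document, as in the sample response. Listing all keys of a bucket is an expensive operation in Riak, so treat this as a smoke test rather than something to run against a busy cluster.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def keys_url(base, bucket):
    """URL that lists all keys in a bucket (same endpoint as the first curl call)."""
    return f"{base}/buckets/{quote(bucket, safe='')}/keys?keys=true"


def object_url(base, bucket, key):
    """URL of a single stored object (same endpoint as the second curl call)."""
    return f"{base}/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"


def fetch_json(url, timeout=5):
    """GET a URL and decode the response body as JSON."""
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def sample_records(base="http://localhost:8098", bucket="fluentlog", limit=3):
    """Fetch up to `limit` stored events, taking keys in sorted order."""
    keys = fetch_json(keys_url(base, bucket))["keys"]
    return [fetch_json(object_url(base, bucket, k)) for k in sorted(keys)[:limit]]
```

With Riak running locally, calling sample_records() should return a short list of dictionaries shaped like the JSON shown above.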

Learn More

If this article is incorrect or outdated, or omits critical information, please let us know. Fluentd is an open source project under the Cloud Native Computing Foundation (CNCF). All components are available under the Apache 2 License.