# Apache To Riak

This article explains how to use Fluentd's Riak Output plugin ([out\_riak](https://github.com/kuenishi/fluent-plugin-riak)) to aggregate semi-structured logs in real-time.

![](http://docs.fluentd.org/images/fluentd-riak.png)

## Prerequisites

1. An OSX or Linux machine
2. Fluentd is installed ([installation guide](https://docs.fluentd.org/0.12/quickstart/installation))
3. Riak is installed
4. An Apache web server log

## Installing the Fluentd Riak Output Plugin

The [Riak output plugin](https://github.com/kuenishi/fluent-plugin-riak) is used to output data from a Fluentd node to a Riak node.

### Rubygems Users

Rubygems users can run the command below to install the plugin:

```
$ gem install fluent-plugin-riak
```

### td-agent Users

If you are using td-agent, run following command to install the Riak output plugin.

* td-agent v2: `/usr/sbin/td-agent-gem install fluent-plugin-riak`
* td-agent v1: \`/usr/lib/fluent/ruby/bin/fluent-gem install

  fluent-plugin-riak

## Configuring Fluentd

Create a configuration file called `fluent.conf` and add the following lines:

```
<source>
  @type tail
  format apache2
  path /var/log/apache2/access_log
  pos_file /var/log/fluentd/apache2.access_log.pos
  tag riak.apache
</source>

<match riak.**>
  @type riak
  buffer_type memory
  flush_interval 5s
  retry_limit 5
  retry_wait 1s
  nodes localhost:8087 # Assumes Riak is running locally on port 8087
</match>
```

The `<source>...</source>` section tells Fluentd to tail an Apache2-formatted log file located at `/var/log/apache2/access_log`. Each line is parsed as an Apache access log event and tagged with the `riak.apache` label.

The `<match riak.**>...</match>` section tells Fluentd to look for events whose tags start with `riak.` and send all matches to a Riak node located at `localhost:8087`. You can send events to multiple nodes by writing `nodes host1 host2 host3` instead.

## Testing

Launch Fluentd with the following command:

```
$ fluentd -c fluentd.conf
```

Please confirm that you have the file access permissions to (1) read the Apache log file and (2) write to \`/var/log/fluentd/apache2.access\_log.pos\` (sudo-ing might help).

You should now see data coming into your Riak cluster. We can make sure that everything is running smoothly by hitting Riak's HTTP API:

```
$ curl http://localhost:8098/buckets/fluentlog/keys?keys=true
{"keys":["2014-01-23-d30b0698-b9de-4290-b8be-a66555497078", ...]}
$ curl http://localhost:8098/buckets/fluentlog/keys/2014-01-23-d30b0698-b9de-4290-b8be-a66555497078
[
  {
    "tag": "riak.apache",
    "time": "2004-03-08T01:23:54Z",
    "host": "64.242.88.10",
    "user": null,
    "method": "GET",
    "path": "/twiki/bin/statistics/Main",
    "code": 200,
    "size": 808,
    "referer": null,
    "agent": null
  }
]
```

There it is! (the response JSON is formatted for readability)

## Learn More

* [Fluentd Architecture](https://www.fluentd.org/architecture)
* [Fluentd Get Started](https://docs.fluentd.org/0.12/articles/broken-reference)
* [Riak Output Plugin](http://github.com/kuenishi/fluent-plugin-riak)

If this article is incorrect or outdated, or omits critical information, please [let us know](https://github.com/fluent/fluentd-docs-gitbook/issues?state=open). [Fluentd](http://www.fluentd.org/) is a open source project under [Cloud Native Computing Foundation (CNCF)](https://cncf.io/). All components are available under the Apache 2 License.
