Fluentd
0.12
0.12
  • Introduction
  • Overview
    • Getting Started
    • Installation
    • Life of a Fluentd event
    • Support
    • FAQ
  • Use Cases
    • Centralized App Logging
    • Monitoring Service Logs
    • Data Analytics
    • Connecting to Data Storages
    • Stream Processing
    • Windows Event Collection
    • IoT Data Logger
  • Configuration
    • Config File Syntax
    • Routing Examples
    • Recipes
  • Deployment
    • Logging
    • Monitoring
    • Signals
    • RPC
    • High Availability Config
    • Failure Scenarios
    • Performance Tuning
    • Plugin Management
    • Trouble Shooting
    • Secure Forwarding
    • Fluentd UI
    • Command Line Option
  • Container Deployment
    • Docker Image
    • Docker Logging Driver
    • Docker Compose
    • Kubernetes
  • Input Plugins
    • tail
    • forward
    • secure_forward
    • udp
    • tcp
    • http
    • unix
    • syslog
    • exec
    • scribe
    • multiprocess
    • dummy
    • Others
  • Output Plugins
    • file
    • s3
    • kafka
    • forward
    • secure_forward
    • exec
    • exec_filter
    • copy
    • geoip
    • roundrobin
    • stdout
    • null
    • webhdfs
    • splunk
    • mongo
    • mongo_replset
    • relabel
    • rewrite_tag_filter
    • Others
  • Buffer Plugins
    • memory
    • file
  • Filter Plugins
    • record_transformer
    • grep
    • parser
    • stdout
  • Parser Plugins
    • regexp
    • apache2
    • apache_error
    • nginx
    • syslog
    • ltsv
    • csv
    • tsv
    • json
    • multiline
    • none
  • Formatter Plugins
    • out_file
    • json
    • ltsv
    • csv
    • msgpack
    • hash
    • single_value
  • Developer
    • Plugin Development
    • Community
    • Mailing List
    • Source Code
    • Bug Tracking
    • ChangeLog
    • Logo
  • Articles
    • Store Apache Logs into MongoDB
    • Apache To Riak
    • Store Apache Logs into Amazon S3
    • Before Install
    • Cep Norikra
    • Collect Glusterfs Logs
    • Common Log Formats
    • Docker Logging Efk Compose
    • Docker Logging
    • Filter Modify Apache
    • Forwarding Over Ssl
    • Free Alternative To Splunk By Fluentd
    • Data Collection to Hadoop (HDFS)
    • Data Analytics with Treasure Data
    • Install By Chef
    • Install By Deb
    • Install By Dmg
    • Install By Gem
    • Install By Rpm
    • Install From Source
    • Install On Beanstalk
    • Install On Heroku
    • Java
    • Kinesis Stream
    • Kubernetes Fluentd
    • Monitoring by Prometheus
    • Monitoring by Rest Api
    • Nodejs
    • Performance Tuning Multi Process
    • Performance Tuning Single Process
    • Perl
    • Php
    • Python
    • Quickstart
    • Raspberrypi Cloud Data Logger
    • Recipe Apache Logs To Elasticsearch
    • Recipe Apache Logs To Mongo
    • Recipe Apache Logs To S3
    • Recipe Apache Logs To Treasure Data
    • Recipe Cloudstack To Mongodb
    • Recipe Csv To Elasticsearch
    • Recipe Csv To Mongo
    • Recipe Csv To S3
    • Recipe Csv To Treasure Data
    • Recipe Http Rest Api To Elasticsearch
    • Recipe Http Rest Api To Mongo
    • Recipe Http Rest Api To S3
    • Recipe Http Rest Api To Treasure Data
    • Recipe Json To Elasticsearch
    • Recipe Json To Mongo
    • Recipe Json To S3
    • Recipe Json To Treasure Data
    • Recipe Nginx To Elasticsearch
    • Recipe Nginx To Mongo
    • Recipe Nginx To S3
    • Recipe Nginx To Treasure Data
    • Recipe Syslog To Elasticsearch
    • Recipe Syslog To Mongo
    • Recipe Syslog To S3
    • Recipe Syslog To Treasure Data
    • Recipe Tsv To Elasticsearch
    • Recipe Tsv To Mongo
    • Recipe Tsv To S3
    • Recipe Tsv To Treasure Data
    • Ruby
    • Scala
    • Splunk Like Grep And Alert Email
Powered by GitBook
On this page
  • Message Delivery Semantics
  • Network Topology
  • Log Forwarder Configuration
  • Log Aggregator Configuration
  • Failure Case Scenarios
  • Forwarder Failure
  • Aggregator Failure
  • Trouble Shooting
  • "no nodes are available"

Was this helpful?

  1. Deployment

High Availability Config

For high-traffic websites, we recommend using a high availability configuration of Fluentd.

Message Delivery Semantics

Fluentd is designed primarily for event-log delivery systems.

In such systems, several delivery guarantees are possible:

  • At most once: Messages are immediately transferred. If the

    transfer succeeds, the message is never sent out again. However,

    many failure scenarios can cause lost messages (ex: no more write

    capacity)

  • At least once: Each message is delivered at least once. In failure

    cases, messages may be delivered twice.

  • Exactly once: Each message is delivered once and only once. This

    is what people want.

If the system "can't lose a single event", and must also transfer "exactly once", then the system must stop ingesting events when it runs out of write capacity. The proper approach would be to use synchronous logging and return errors when the event cannot be accepted.

That's why Fluentd provides 'At most once' and 'At least once' transfers. In order to collect massive amounts of data without impacting application performance, a data logger must transfer data asynchronously. This improves performance at the cost of potential delivery failures.

However, most failure scenarios are preventable. The following sections describe how to set up Fluentd's topology for high availability.

Network Topology

To configure Fluentd for high availability, we assume that your network consists of 'log forwarders' and 'log aggregators'.

'log forwarders' are typically installed on every node to receive local events. Once an event is received, they forward it to the 'log aggregators' through the network.

'log aggregators' are daemons that continuously receive events from the log forwarders. They buffer the events and periodically upload the data into the cloud.

Fluentd can act as either a log forwarder or a log aggregator, depending on its configuration. The next sections describes the respective setups. We assume that the active log aggregator has ip '192.168.0.1' and that the backup has ip '192.168.0.2'.

Log Forwarder Configuration

Please add the following lines to your config file for log forwarders. This will configure your log forwarders to transfer logs to log aggregators.

# TCP input
<source>
  @type forward
  port 24224
</source>

# HTTP input
<source>
  @type http
  port 8888
</source>

# Log Forwarding
<match mytag.**>
  @type forward

  # primary host
  <server>
    host 192.168.0.1
    port 24224
  </server>
  # use secondary host
  <server>
    host 192.168.0.2
    port 24224
    standby
  </server>

  # use longer flush_interval to reduce CPU usage.
  # note that this is a trade-off against latency.
  flush_interval 60s
</match>

When the active aggregator (192.168.0.1) dies, the logs will instead be sent to the backup aggregator (192.168.0.2). If both servers die, the logs are buffered on-disk at the corresponding forwarder nodes.

Log Aggregator Configuration

Please add the following lines to the config file for log aggregators. The input source for the log transfer is TCP.

# Input
<source>
  @type forward
  port 24224
</source>

# Output
<match mytag.**>
  ...
</match>

The incoming logs are buffered, then periodically uploaded into the cloud. If upload fails, the logs are stored on the local disk until the retransmission succeeds.

Failure Case Scenarios

Forwarder Failure

When a log forwarder receives events from applications, the events are first written into a disk buffer (specified by buffer_path). After every flush_interval, the buffered data is forwarded to aggregators.

This process is inherently robust against data loss. If a log forwarder's fluentd process dies, the buffered data is properly transferred to its aggregator after it restarts. If the network between forwarders and aggregators breaks, the data transfer is automatically retried.

However, possible message loss scenarios do exist:

  • The process dies immediately after receiving the events, but before

    writing them into the buffer.

  • The forwarder's disk is broken, and the file buffer is lost.

Aggregator Failure

When log aggregators receive events from log forwarders, the events are first written into a disk buffer (specified by buffer_path). After every flush_interval, the buffered data is uploaded into the cloud.

This process is inherently robust against data loss. If a log aggregator's fluentd process dies, the data from the log forwarder is properly retransferred after it restarts. If the network between aggregators and the cloud breaks, the data transfer is automatically retried.

However, possible message loss scenarios do exist:

  • The process dies immediately after receiving the events, but before

    writing them into the buffer.

  • The aggregator's disk is broken, and the file buffer is lost.

Trouble Shooting

"no nodes are available"

Please make sure that you can communicate with port 24224 using not only TCP, but also UDP. These commands will be useful for checking the network configuration.

$ telnet host 24224
$ nmap -p 24224 -sU host
PreviousRPCNextFailure Scenarios

Last updated 5 years ago

Was this helpful?

Please note that there is one where VMware will occasionally lose small UDP packages used for heartbeat.

If this article is incorrect or outdated, or omits critical information, please . is a open source project under . All components are available under the Apache 2 License.

known issue
let us know
Fluentd
Cloud Native Computing Foundation (CNCF)