Monitoring by Prometheus

This article describes how to monitor Fluentd via Prometheus.

Since both Prometheus and Fluentd are under CNCF (Cloud Native Computing Foundation), the Fluentd project recommends using Prometheus by default to monitor Fluentd.

Installation

First, install the fluent-plugin-prometheus gem.

$ fluent-gem install fluent-plugin-prometheus --version=0.4.0

If you are using td-agent, use td-agent-gem for installation.

$ sudo td-agent-gem install fluent-plugin-prometheus --version=0.4.0

Example Fluentd Configuration

To expose the Fluentd metrics to Prometheus, we need to configure 3 parts:

  • Step 1: Prometheus Filter Plugin to count Incoming Records

  • Step 2: Prometheus Output Plugin to count Outgoing Records

  • Step 3: Prometheus Input Plugin to expose metrics via HTTP

Step 1: Counting Incoming Records by Prometheus Filter Plugin

First, add a <filter> section like the one below to count the incoming records per tag. With this configuration, the prometheus filter increments an internal counter each time a record comes in.

# source
<source>
  @type forward
  bind 0.0.0.0
  port 24224
</source>

# count number of incoming records per tag
<filter company.*>
  @type prometheus
  <metric>
    name fluentd_input_status_num_records_total
    type counter
    desc The total number of incoming records
    <labels>
      tag ${tag}
      hostname ${hostname}
    </labels>
  </metric>
</filter>
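
Conceptually, the filter keeps one counter per distinct label set and increments it for every record that passes through. The Python sketch below illustrates only that idea; it is not the plugin's implementation, and `count_records`, the sample events, and the host name are hypothetical.

```python
from collections import Counter

def count_records(events, hostname):
    """Count events per (tag, hostname) label set, as a labeled counter would."""
    counter = Counter()
    for tag, _record in events:
        # One counter value exists per unique combination of label values.
        counter[(tag, hostname)] += 1
    return counter

# Hypothetical stream of (tag, record) pairs.
events = [
    ("company.test1", {"message": "hello"}),
    ("company.test1", {"message": "hello"}),
    ("company.test1", {"message": "hello"}),
    ("company.test2", {"message": "hello"}),
]

counts = count_records(events, "KZK.local")
for (tag, host), total in sorted(counts.items()):
    print(tag, host, total)
```

Each distinct tag produces its own time series, so a very large number of distinct tags also means a large number of series to scrape.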

Step 2: Counting Outgoing Records by Prometheus Output Plugin

Second, use the copy plugin together with the prometheus output plugin to count the outgoing records per tag. With this configuration, the prometheus output increments an internal counter each time a record goes out.

# count number of outgoing records per tag
<match company.*>
  @type copy
  <store>
    @type forward
    <server>
      name myserver1
      host 192.168.1.3
      port 24224
      weight 60
    </server>
  </store>
  <store>
    @type prometheus
    <metric>
      name fluentd_output_status_num_records_total
      type counter
      desc The total number of outgoing records
      <labels>
        tag ${tag}
        hostname ${hostname}
      </labels>
    </metric>
  </store>
</match>

Step 3: Expose Metrics by Prometheus Input Plugin via HTTP

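Finally, use the prometheus input plugin to expose the internal counters via HTTP. The prometheus_output_monitor source in the same snippet additionally collects the output plugins' buffer and retry metrics every 10 seconds.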

# expose metrics in prometheus format
<source>
  @type prometheus
  bind 0.0.0.0
  port 24231
  metrics_path /metrics
</source>
<source>
  @type prometheus_output_monitor
  interval 10
  <labels>
    hostname ${hostname}
  </labels>
</source>

Step 4: Check the Configuration

After making these three changes, restart Fluentd.

# For stand-alone Fluentd installations
$ fluentd -c fluentd.conf
# For td-agent users
$ sudo /etc/init.d/td-agent restart

Let's send some records.

$ echo '{"message":"hello"}' | bundle exec fluent-cat company.test1
$ echo '{"message":"hello"}' | bundle exec fluent-cat company.test1
$ echo '{"message":"hello"}' | bundle exec fluent-cat company.test1
$ echo '{"message":"hello"}' | bundle exec fluent-cat company.test2

Then access http://localhost:24231/metrics, the URL that serves the metrics in Prometheus format.

$ curl http://localhost:24231/metrics
# TYPE fluentd_input_status_num_records_total counter
# HELP fluentd_input_status_num_records_total The total number of incoming records
fluentd_input_status_num_records_total{tag="company.test1",hostname="KZK.local"} 3.0
fluentd_input_status_num_records_total{tag="company.test2",hostname="KZK.local"} 1.0
# TYPE fluentd_output_status_num_records_total counter
# HELP fluentd_output_status_num_records_total The total number of outgoing records
fluentd_output_status_num_records_total{tag="company.test1",hostname="KZK.local"} 3.0
fluentd_output_status_num_records_total{tag="company.test2",hostname="KZK.local"} 1.0
# TYPE fluentd_output_status_buffer_queue_length gauge
# HELP fluentd_output_status_buffer_queue_length Current buffer queue length.
fluentd_output_status_buffer_queue_length{hostname="KZK.local",plugin_id="object:3fcbccc6d388",type="forward"} 1.0
....
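
To inspect this output from a script rather than by eye, the text can be split into metric name, labels, and value. The following Python sketch is a simplified reader written for illustration, not an official client: `parse_metrics` is a hypothetical helper, the sample text is made up to resemble the output above, and lines with trailing timestamps or other unrecognized forms are simply skipped.

```python
import re

# One sample line: metric name, optional {labels}, then the value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
# One label pair inside the braces: key="value".
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')

def parse_metrics(text):
    """Return a list of (name, labels_dict, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if match is None:  # ignore anything this simplified reader cannot handle
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples

# Made-up sample resembling what the /metrics endpoint returns.
EXAMPLE = '''\
# TYPE fluentd_input_status_num_records_total counter
fluentd_input_status_num_records_total{tag="company.test1",hostname="KZK.local"} 3.0
fluentd_input_status_num_records_total{tag="company.test2",hostname="KZK.local"} 1.0
'''

for name, labels, value in parse_metrics(EXAMPLE):
    print(name, labels["tag"], value)
```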

Example Prometheus Configuration

Please prepare the file below as prometheus.yml.

global:
  scrape_interval: 10s # Set the scrape interval to every 10 seconds. Default is every 1 minute.

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's the Fluentd metrics endpoint.
scrape_configs:
  - job_name: 'fluentd'
    static_configs:
      - targets: ['localhost:24231']

Then launch the Prometheus process.

$ ./prometheus --config.file="prometheus.yml"

Now open http://localhost:9090/ in your browser.

How to use Prometheus to monitor Fluentd

List of Fluentd nodes

If you go to http://localhost:9090/targets, Prometheus will show you a list of Fluentd nodes and their status.

List of Fluentd metrics

Then, visit http://localhost:9090/graph to explore Fluentd internal metrics. There, you'll see 8 metrics in the metric list:

  • fluentd_input_status_num_records_total

  • fluentd_output_status_buffer_queue_length

  • fluentd_output_status_buffer_total_bytes

  • fluentd_output_status_emit_count

  • fluentd_output_status_num_errors

  • fluentd_output_status_num_records_total

  • fluentd_output_status_retry_count

  • fluentd_output_status_retry_wait

Please pick fluentd_input_status_num_records_total, and you'll see the total incoming records per tag.
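
The same metric can also be fetched without the browser through the Prometheus HTTP API, whose instant-query endpoint is `/api/v1/query`. The Python sketch below assumes Prometheus is listening on localhost:9090 as in this article; `query_url` and `run_query` are hypothetical helper names, and `run_query` only works while Prometheus is actually running.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

PROMETHEUS = "http://localhost:9090"  # address used in this article

def query_url(promql, base=PROMETHEUS):
    """Build the instant-query URL for a PromQL expression."""
    return f"{base}/api/v1/query?{urlencode({'query': promql})}"

def run_query(promql):
    """Send the query and return the 'result' list (needs a running Prometheus)."""
    with urlopen(query_url(promql), timeout=5) as response:
        body = json.load(response)
    return body["data"]["result"]

# Only builds and prints the URL, so it runs without a server.
print(query_url("fluentd_input_status_num_records_total"))
```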

Example Prometheus Queries

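Since fluentd_input_status_num_records_total and fluentd_output_status_num_records_total are monotonically increasing counters, they need a little calculation with PromQL (Prometheus Query Language) to become meaningful. Here are example PromQL queries for the metrics most people want to see.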

# number of available nodes
up

# incoming records / sec / host
sum(rate(fluentd_input_status_num_records_total[1m])) by (hostname)

# incoming records / sec / tag
sum(rate(fluentd_input_status_num_records_total[1m])) by (tag)

# outgoing records / sec / host
sum(rate(fluentd_output_status_num_records_total[1m])) by (hostname)

# outgoing records / sec / tag
sum(rate(fluentd_output_status_num_records_total[1m])) by (tag)

# emit count / sec
rate(fluentd_output_status_emit_count[1m])
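
To see what `rate(...[1m])` does with a counter, the Python sketch below computes a per-second increase from the samples in one window. It is a deliberate simplification: the real PromQL `rate()` also extrapolates to the window boundaries, which this version does not attempt. `simple_rate` and the sample values are hypothetical.

```python
def simple_rate(samples):
    """Approximate per-second increase of a counter.

    `samples` is a list of (unix_timestamp, value) pairs in time order,
    i.e. the scraped points that fall inside one query window.
    """
    if len(samples) < 2:
        return None  # need at least two points to compute a rate
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        # A drop means the counter was reset (e.g. Fluentd restarted),
        # so the new value itself is the increase since the reset.
        increase += curr - prev if curr >= prev else curr
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None

# Hypothetical scrapes taken every 10 seconds.
window = [(0, 100.0), (10, 130.0), (20, 160.0), (30, 190.0)]
print(simple_rate(window))  # 3.0 records per second
```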

Metrics to Monitor

In addition to the traffic metrics introduced above, it is important to monitor the queue length and error count.

If these values keep increasing, Fluentd cannot flush its buffer to the destination, and you will lose data once the buffer becomes full.

# maximum buffer length in last 1min
max_over_time(fluentd_output_status_buffer_queue_length[1m])

# maximum buffer bytes in last 1min
max_over_time(fluentd_output_status_buffer_total_bytes[1m])

# maximum retry wait in last 1min
max_over_time(fluentd_output_status_retry_wait[1m])

# retry count / sec
rate(fluentd_output_status_retry_count[1m])
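
As a rough illustration of how such a window could drive a check, the Python sketch below takes the peak of a series and flags a window that is both above a limit and still growing. The rule, the threshold, and the sample values are assumptions made for this example, not recommendations from the Fluentd or Prometheus projects.

```python
def max_over_time(values):
    """Largest sample in the window, like PromQL's max_over_time()."""
    return max(values)

def is_backing_up(values, threshold):
    """Hypothetical rule: the peak exceeds `threshold` and the last
    sample is higher than the first, i.e. the queue is still growing."""
    return max_over_time(values) > threshold and values[-1] > values[0]

# Hypothetical buffer queue lengths scraped during one minute.
queue_length = [1.0, 4.0, 9.0, 15.0]
print(max_over_time(queue_length))
print(is_backing_up(queue_length, 10))
print(is_backing_up([3.0, 2.0, 1.0], 10))
```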

Grafana for Advanced Visualization / Alerting

For more advanced visualization and alerting, we recommend using Grafana as a visualization frontend for Prometheus.

  • Grafana Support for Prometheus

Further Readings

  • Prometheus Documentation

  • Grafana Documentation

If this article is incorrect or outdated, or omits critical information, please let us know. Fluentd is an open source project under Cloud Native Computing Foundation (CNCF). All components are available under the Apache 2 License.