Filter Modify Apache


Last updated 5 years ago


In this article, we introduce several common data manipulation challenges faced by our users (such as filtering and modifying data) and explain how to solve each task using one or more Fluentd plugins.

Scenario: Filtering Data by the Value of a Field

Let's suppose our Fluentd instances are collecting data from Apache web server logs via in_tail. Our goal is to filter out all requests with status code 200.

Solution: Use fluent-plugin-grep

fluent-plugin-grep is a plugin that can "grep" data according to the different fields within Fluentd events.

If our events look like

{
    "code": 200,
    "url": "http://yourdomain.com/page.html",
    "size": 2344,
    "referer": "http://www.treasuredata.com"
    ...
}

then we can filter out all the requests with status code 200 as follows:

...
<match apache.**>
    @type grep
    input_key code
    exclude ^200$
    add_tag_prefix filtered
</match>
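...

To route the events that remain, pair the grep section with a section that matches the prefixed tag:

<match apache.**>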
    @type grep
    input_key code
    exclude ^200$
    add_tag_prefix filtered
</match>
<match filtered.apache.**>
    @type td_log
    apikey XXXXX
    ...
</match>
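
The drop, re-tag, and route flow in the configuration above can be sketched in Python. This only illustrates the behaviour described in this article and is not the plugin's implementation: the function name `grep_exclude` and the sample events are hypothetical, and matching against the string form of the field is an assumption.

```python
import re


def grep_exclude(tag, record, input_key, exclude, add_tag_prefix):
    """Return (new_tag, record) for kept events, or None for dropped ones."""
    # Assumption: the field is compared in its string form, so 200 -> "200".
    value = str(record.get(input_key, ""))
    if re.search(exclude, value):
        return None  # e.g. code 200 matches ^200$ and is filtered out
    # Kept events get the prefix so that a later section can match them.
    return f"{add_tag_prefix}.{tag}", record


# Hypothetical sample events, tagged the way in_tail might tag Apache logs.
events = [
    ("apache.access", {"code": 200, "url": "http://yourdomain.com/page.html"}),
    ("apache.access", {"code": 404, "url": "http://yourdomain.com/missing.html"}),
]

kept = []
for tag, record in events:
    result = grep_exclude(tag, record, "code", r"^200$", "filtered")
    if result is not None:
        kept.append(result)

for new_tag, record in kept:
    print(new_tag, record["code"])
```

Only the 404 event survives, and its tag becomes filtered.apache.access, which is what the filtered.apache.** pattern is meant to catch.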

fluent-plugin-grep can filter on multiple fields as well. The config below keeps all requests with status code 4xx that are NOT referred from yourdomain.com (a real-world use case: counting dead links in the wild by filtering out internal links).

...
<match apache.**>
    @type grep
    regexp1 code ^4\d\d$
    exclude1 referer ^https?://yourdomain.com
    add_tag_prefix external_dead_links
</match>
...
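
As a rough model of the multi-field configuration above, the following Python sketch keeps a record only when every regexpN pattern matches and no excludeN pattern matches. That combination rule is an assumption based on the description in this article, and the helper name `keep_event` and the sample records are hypothetical.

```python
import re


def keep_event(record, regexps, excludes):
    """Keep a record only if every regexp matches and no exclude matches."""

    def field(key):
        # Assumption: fields are matched in their string form.
        return str(record.get(key, ""))

    if not all(re.search(pattern, field(key)) for key, pattern in regexps):
        return False
    return not any(re.search(pattern, field(key)) for key, pattern in excludes)


# Mirrors regexp1 and exclude1 from the configuration above.
regexps = [("code", r"^4\d\d$")]
excludes = [("referer", r"^https?://yourdomain.com")]

samples = [
    {"code": 404, "referer": "http://www.treasuredata.com"},  # external dead link
    {"code": 404, "referer": "http://yourdomain.com/index"},  # internal link
    {"code": 200, "referer": "http://www.treasuredata.com"},  # not a 4xx
]

results = [keep_event(sample, regexps, excludes) for sample in samples]
print(results)
```

Only the first sample, a 4xx request referred from another site, is kept.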

Scenario: Adding a New Field (such as hostname)

When collecting data, we often need to add a new field or change an existing field in our log data. For example, many Fluentd users need to add the hostname of their servers to the Apache web server log data in order to compute the number of requests handled by each server (i.e., store them in MongoDB/HDFS and run GROUP-BYs).

Solution: Use fluent-plugin-record-modifier

If our events look like

{"code":200, "url":"http://yourdomain.com", "size":1232}

then we can add a new field with the hostname information as follows:

<match foo.bar>
    @type record_modifier
    gen_host "#{Socket.gethostname}"
    tag with_hostname
</match>
...
<match with_hostname>
    ...
</match>

The modified events now look like

{"gen_host": "our_server", "code":200, "url":"http://yourdomain.com", "size":1232}

NOTE: The "#{Socket.gethostname}" placeholder is evaluated when the configuration is parsed. It inlines the hostname of the server that the Fluentd instance is running on (in this example, the server's name is "our_server").
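
The note above implies that the hostname is resolved once and then attached to every record. A minimal Python sketch of that idea follows; it is a stand-in for the behaviour described here, not the plugin's code, and the names `GEN_HOST` and `add_hostname` are hypothetical.

```python
import socket

# Resolved once, up front, in the same spirit as "#{Socket.gethostname}"
# being evaluated while the configuration is parsed.
GEN_HOST = socket.gethostname()


def add_hostname(record, gen_host=GEN_HOST):
    """Return a copy of the record with a gen_host field added."""
    return {"gen_host": gen_host, **record}


event = {"code": 200, "url": "http://yourdomain.com", "size": 1232}
modified = add_hostname(event)
print(modified)
```

Because the value is fixed up front, every event processed by the same instance carries the same gen_host value.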

In the first scenario, the add_tag_prefix option prepends a tag to the filtered events so that they can be matched by a subsequent section. This is how the second grep configuration sends all logs with non-200 status codes to Treasure Data.

In the second scenario, fluent-plugin-record-modifier adds the new field to each data record.

If this article is incorrect or outdated, or omits critical information, please let us know. Fluentd is an open source project under Cloud Native Computing Foundation (CNCF). All components are available under the Apache 2 License.