Filter Modify Apache
In this article, we introduce several common data manipulation challenges faced by our users (such as filtering and modifying data) and explain how to solve each task using one or more Fluentd plugins.
Let's suppose our Fluentd instances are collecting data from Apache web server logs via in_tail. Our goal is to filter out all requests that returned status code 200.
fluent-plugin-grep is a plugin that can "grep" data according to the different fields within Fluentd events.
If each event carries the HTTP status code in one of its fields, then we can filter out all the requests with status code 200 by excluding every event whose status field matches 200.
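The filter described above can be sketched as a fluent-plugin-grep configuration. This is a minimal sketch under assumptions: the tag pattern `apache.**`, the field name `code`, and the sample event in the comment are hypothetical, and `@type` is the spelling used by current Fluentd releases (older releases wrote `type`).

```
# Hypothetical input event, tagged apache.access:
#   {"code": 200, "url": "http://yourdomain.com/page.html", "size": 2344}

<match apache.**>
  @type grep
  # Field whose value is tested against the pattern below
  input_key code
  # Drop every event whose status code is exactly 200
  exclude ^200$
  # Re-emit the remaining events as filtered.apache.*
  add_tag_prefix filtered
</match>
```

Because the re-emitted events carry the `filtered.` prefix, they no longer match `apache.**` and are not fed back into this same section.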
fluent-plugin-grep can filter based on multiple fields as well. For example, it can keep all requests with status code 4xx that are NOT referred from yourdomain.com (a real-world use case: figuring out how many dead links there are in the wild by filtering out internal links).
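A multi-field filter of this kind might look like the sketch below. It relies on the numbered `regexpN` and `excludeN` parameters of fluent-plugin-grep, each of which takes a field name followed by a pattern; the field names `code` and `referer` and the tag pattern are assumptions about how the Apache events are structured.

```
<match apache.**>
  @type grep
  # Keep only events whose status code is 4xx
  regexp1 code ^4\d\d$
  # Among those, drop events referred from our own domain
  exclude1 referer ^https?://yourdomain\.com
  # Re-emit the surviving events as filtered.apache.*
  add_tag_prefix filtered
</match>
```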
When collecting data, we often need to add a new field or change an existing field in our log data. For example, many Fluentd users need to add the hostname of their servers to the Apache web server log data in order to compute the number of requests handled by each server (i.e., store them in MongoDB/HDFS and run GROUP-BYs).
If our events do not yet carry the name of the server that produced them, then we can add a new field with the hostname information to each record.
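One way to add such a field is sketched below, using the filter form of fluent-plugin-record-modifier. The tag `foo.bar`, the field name `server_host`, and the sample events in the comments are hypothetical; only the `"#{Socket.gethostname}"` expression comes from this article.

```
# Hypothetical event before the filter:
#   {"code": 200, "url": "http://yourdomain.com", "size": 1232}
# The same event afterwards, on a server named our_server:
#   {"code": 200, "url": "http://yourdomain.com", "size": 1232, "server_host": "our_server"}

<filter foo.bar>
  @type record_modifier
  <record>
    # Evaluated once, when the configuration is parsed
    server_host "#{Socket.gethostname}"
  </record>
</filter>
```

A filter section leaves the tag unchanged, so later `<match foo.bar>` sections receive the enriched events directly.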
The modified events now contain all of their original fields plus the new field holding the hostname.
NOTE: The "#{Socket.gethostname}" placeholder is interpreted during the configuration parsing phase. It inlines the hostname of the server that the Fluentd instance is running on (in this example, our server's name is "our_server").
By using the add_tag_prefix option of fluent-plugin-grep, we can prepend a prefix to the tag of filtered events so that they can be matched by a subsequent section. For example, we can route all logs with non-200 status codes to a separate output destination.
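The routing described here can be sketched with two sections. The first filters and re-tags the events; the second matches the new tag. The built-in `stdout` output stands in as a placeholder destination, and the tag pattern and the field name `code` are assumptions.

```
<match apache.**>
  @type grep
  input_key code
  # Drop events with status code 200
  exclude ^200$
  # Surviving events are re-emitted as filtered.apache.*
  add_tag_prefix filtered
</match>

# Only the non-200 events reach this section
<match filtered.apache.**>
  # Placeholder destination for this sketch; replace with the real output plugin
  @type stdout
</match>
```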
The fluent-plugin-record-modifier plugin can add a new field to each data record.
Fluentd is an open source project. All components are available under the Apache 2 License.