Life of a Fluentd event
This article gives a general overview of how Fluentd processes events, with examples. It covers the complete lifecycle, including Setup, Inputs, Filters, Matches and Labels.
Basic Setup
The configuration file is the fundamental piece that connects everything together: it defines which Inputs or listeners Fluentd will have and sets up common matching rules to route the Event data to a specific Output.
We will use the in_http and out_stdout plugins as examples to describe the event cycle. First, the configuration file defines an http input, so that Fluentd listens for HTTP requests. This definition specifies that an HTTP server will be listening on TCP port 8888.
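A minimal sketch of such an input definition is shown here. The port comes from the text above; the bind address is an assumption for illustration and can be omitted or changed.

```text
# Input: accept events over HTTP on TCP port 8888.
<source>
  @type http
  port 8888
  # Assumed for illustration: listen on all interfaces.
  bind 0.0.0.0
</source>
```

With in_http, the URL path of each request is used as the Tag of the resulting event.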
Now, let's define a Match rule to print the incoming requests to the standard output. The Match sets a rule so that each incoming event whose Tag equals test.cycle will match and be handled by the Output plugin type called stdout. At this point we have an Input type, a Match and an Output.
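The Match rule described above can be sketched as follows. This is a minimal example that assumes it sits in the same configuration file as the http input.

```text
# Output: print every event tagged test.cycle to standard output.
<match test.cycle>
  @type stdout
</match>
```

With this in place, a request such as curl -X POST -d 'json={"action":"login"}' http://localhost:8888/test.cycle should produce an event tagged test.cycle. The host and the record contents are illustrative assumptions.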
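Let's test this setup using curl. Send an HTTP POST request to port 8888 with test.cycle as the URL path, which in_http uses as the Tag, and a JSON record in the request body. Fluentd then prints the event to its log, showing the event time, the Tag test.cycle and the JSON record.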
Event Structure
A Fluentd event consists of three components:
tag: Specifies the origin where an event comes from. It is used for message routing.
time: Specifies the time when an event happens, with nanosecond resolution.
record: Specifies the actual log as a JSON object.
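The input plugin is responsible for generating the Fluentd event from data sources. For example, in_tail generates events from text lines. Given a line from an Apache access log and an Apache parser, it produces an event whose tag is set by the configuration, whose time is parsed from the timestamp in the line, and whose record holds the parsed fields, such as host, method, path, code and size.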
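As a sketch of how the three components arise, the following in_tail definition reads an Apache access log. The file paths and the tag name are hypothetical and chosen only for illustration.

```text
# Input: turn each line of an Apache access log into a Fluentd event.
<source>
  @type tail
  # Hypothetical log location.
  path /var/log/apache2/access.log
  # Hypothetical position file, used to remember how far the log was read.
  pos_file /var/log/fluent/apache2.access.log.pos
  # The tag of every generated event (illustrative name).
  tag apache.access
  <parse>
    # Parses the line into the record and extracts the event time.
    @type apache2
  </parse>
</source>
```

Here the tag comes from the tag parameter, while the time and the record come from the parser.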
Processing Events
When a Setup is defined, the Router Engine contains several predefined rules to apply to different input data. Internally, an Event will pass through a chain of procedures that may alter its lifecycle.
Now, we will expand on our previous basic example and add more steps in our Setup to demonstrate how the Events cycle can be altered. We will do this through the new Filters implementation.
Filters
A Filter behaves like a rule to pass or reject an event. Here we add a Filter definition to the configuration. The new Filter definition is a mandatory step that an event must pass before control goes to the Match section. The Filter accepts or rejects the Event based on its type and rule. For our example, we want to discard any user logout action; we only care about the logins. To accomplish this, the Filter uses grep to exclude any message whose action key has the value logout.
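The Filter described above can be sketched as follows. This is a minimal example; the exact regular expression is an assumption and could be written less strictly.

```text
# Filter: drop events tagged test.cycle whose "action" field is "logout".
# Place this section before the <match test.cycle> section.
<filter test.cycle>
  @type grep
  <exclude>
    key action
    # Assumed pattern: matches the exact value "logout".
    pattern /^logout$/
  </exclude>
</filter>
```

Because sections are evaluated from top to bottom, the filter must appear before the Match section for the same Tag to take effect.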
From a terminal, run two curl commands that post records with different action values, one with login and one with logout. The Fluentd logs show only the login message; the logout event has been discarded.
Events follow a step-by-step cycle in which they are processed in order, from top to bottom. The new engine allows as many Filters as required to be integrated. Because the configuration file may grow and become complex for readers, a feature called Labels has been introduced to address this potential problem.
Labels
Labels aim to reduce configuration file complexity by allowing new Routing sections that do not follow the top-to-bottom order; instead, they act like linked references. Taking the previous example, we modify the setup by adding a @label parameter under source, indicating that further steps will take place in the @STAGING label section. For every event reported by the Source, the Routing Engine continues processing in @STAGING. Hence, the event skips the old filter definition.
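A sketch of such a setup is shown here. The label name and port come from the text; the contents of the top-level filter are an assumption, chosen to make it visible that this filter is skipped.

```text
<source>
  @type http
  bind 0.0.0.0
  port 8888
  # Route every event from this source to the @STAGING label section.
  @label @STAGING
</source>

# Top-level filter: skipped for events that were routed to @STAGING.
# (Illustrative rule; if it were applied, logins would be dropped.)
<filter test.cycle>
  @type grep
  <exclude>
    key action
    pattern /^login$/
  </exclude>
</filter>

<label @STAGING>
  # Applied inside the label: drop logout events.
  <filter test.cycle>
    @type grep
    <exclude>
      key action
      pattern /^logout$/
    </exclude>
  </filter>

  <match test.cycle>
    @type stdout
  </match>
</label>
```

With this configuration, a login event should still reach stdout, because only the sections inside the label are applied to it.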
Buffers
In this example, we use stdout, a non-buffered output. In production, however, you typically use outputs in buffered mode, such as forward, mongodb or s3. An output plugin in buffered mode first stores the received events in buffers and then writes the buffers out to the destination once the flush conditions are met. So, with a buffered output, you do not see the received events immediately, unlike with the non-buffered stdout output.
Buffers are important for reliability and throughput. See the Output and Buffer articles.
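As a rough sketch of a buffered output, the following forward definition stores events in a file buffer and flushes them periodically. The destination host, the buffer path and the tuning values are hypothetical and should be adapted to your environment.

```text
# Output: forward events tagged test.cycle to another Fluentd node.
<match test.cycle>
  @type forward
  <server>
    # Hypothetical destination address.
    host 192.0.2.10
    port 24224
  </server>
  <buffer>
    # Keep buffered chunks on disk so they survive a restart.
    @type file
    # Hypothetical buffer directory.
    path /var/log/fluent/buffer/forward
    # Flush conditions (illustrative values).
    flush_interval 5s
    chunk_limit_size 8MB
  </buffer>
</match>
```

Events sent to this output would not appear at the destination until a flush condition is met, which is the delay described above.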
Conclusion
Once events are reported to the Fluentd engine by a Source, they are processed step by step or inside a referenced Label. Any Event may be filtered out at any point. The new Routing Engine behavior provides more flexibility and simplifies processing before events reach the Output plugin.
Learn More
Fluentd is an open-source project under Cloud Native Computing Foundation (CNCF). All components are available under the Apache License 2.0.