Life of a Fluentd event
This article gives an overview of how Fluentd processes events, using examples. It covers the complete cycle, including Setup, Inputs, Filters, Matches and Labels.
Basic Setup
The configuration file is the fundamental piece that connects everything together: it defines which Inputs or listeners Fluentd will have, and sets up common matching rules to route the Event data to a specific Output.
We will use the in_http and out_stdout plugins as examples to describe the event cycle. The following is a basic definition in the configuration file that specifies an HTTP input; in short, Fluentd will be listening for HTTP requests:
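A minimal sketch of such a source definition, assuming the http input plugin's standard port and bind parameters (the bind address of 0.0.0.0 is an illustrative choice, not something stated in this article):

```
# Accept events over HTTP on TCP port 8888
<source>
  @type http
  port 8888
  bind 0.0.0.0
</source>
```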
This definition specifies that an HTTP server will be listening on TCP port 8888. Now let's define a Matching rule and a desired output that will just print the data that arrived on each incoming request to standard output:
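Such a rule could be sketched as follows, assuming the tag test.cycle that the rest of this article uses:

```
# Send every event tagged test.cycle to standard output
<match test.cycle>
  @type stdout
</match>
```

With the http input, the tag is taken from the path of the request URL, so a request sent to /test.cycle produces an event tagged test.cycle.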
The Match directive sets a rule: each incoming event that arrives with a Tag equal to test.cycle will be handled by the Output plugin type called stdout. At this point we have an Input type, a Match and an Output. Let's test the setup using curl:
On the Fluentd server side the output should look like this:
Event structure
A Fluentd event consists of a tag, time and record.
tag: Where an event comes from. For message routing
time: When an event happens. Epoch time
record: Actual log content. JSON object
The input plugin is responsible for generating Fluentd events from specified data sources. For example, in_tail generates events from text lines. If you have the following line in Apache logs:
the following event is generated:
Processing Events
When a Setup is defined, the Router Engine already contains several rules to apply for different input data. Internally an Event will pass through a chain of procedures that may alter the Events cycle.
Now we will expand the previous basic example and add more steps in our Setup to demonstrate how the Events cycle can be altered. We will do this through the new Filters implementation.
Filters
A Filter aims to behave like a rule to either accept or reject an event. The following configuration adds a Filter definition:
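A sketch of what such a Filter definition might look like, placed between the source and the match. It assumes the grep filter's exclude section with its key and pattern parameters; the exact regular expression and the bind address are illustrative choices:

```
<source>
  @type http
  port 8888
  bind 0.0.0.0
</source>

# Drop any event whose "action" field is exactly "logout"
<filter test.cycle>
  @type grep
  <exclude>
    key action
    pattern /^logout$/
  </exclude>
</filter>

<match test.cycle>
  @type stdout
</match>
```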
As you can see, the new Filter definition is a mandatory step before control passes to the Match section. The Filter accepts or rejects the Event based on its type and the defined rule. In our example we want to discard any user logout action; we only care about the login action. The Filter accomplishes this with a grep that excludes any message where the action key has the string value "logout".
From a Terminal, run the following two curl commands (please note that each one contains a different action value):
Now looking at the Fluentd service output we can see that only the event with action equal to "login" is matched. The logout Event was discarded:
As you can see, the Events follow a step-by-step cycle where they are processed in order from top to bottom. The new engine on Fluentd allows integrating as many Filters as needed. Considering that the configuration file might grow and start getting a bit complex, a new feature called Labels has been added that aims to help manage this complexity.
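Before turning to Labels, the top-to-bottom order can be illustrated with two filters in sequence. In the sketch below, the second filter only sees events that passed the first. The use of record_transformer and the added hostname field are assumptions chosen for the example, not part of the setup described in this article:

```
# Step 1: drop logout events
<filter test.cycle>
  @type grep
  <exclude>
    key action
    pattern /^logout$/
  </exclude>
</filter>

# Step 2: runs only on events that survived the grep filter above
<filter test.cycle>
  @type record_transformer
  <record>
    hostname "#{Socket.gethostname}"
  </record>
</filter>

<match test.cycle>
  @type stdout
</match>
```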
Labels
The Labels implementation aims to reduce configuration file complexity. It makes it possible to define new Routing sections that do not follow the top-to-bottom order and instead act like linked references. Using the previous example we will modify the setup as follows:
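A sketch of how the modified setup might look. The label name STAGING comes from this article; the contents of the label section (a grep filter followed by a stdout match) are assumptions made for illustration:

```
<source>
  @type http
  port 8888
  bind 0.0.0.0
  # Send events from this source straight to the STAGING section
  @label @STAGING
</source>

# Old top-level filter: events routed to @STAGING never pass through it
<filter test.cycle>
  @type grep
  <exclude>
    key action
    pattern /^logout$/
  </exclude>
</filter>

<label @STAGING>
  # Filters and matches inside a label apply only to events routed here
  <filter test.cycle>
    @type grep
    <exclude>
      key action
      pattern /^logout$/
    </exclude>
  </filter>

  <match test.cycle>
    @type stdout
  </match>
</label>
```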
This new configuration contains a @label key on the source, indicating that any further steps take place in the STAGING Label section. Every Event reported on the Source is routed by the Routing engine and continues processing in STAGING, skipping the old filter definition.
Buffers
In this example we use the non-buffered stdout output, but in production buffered outputs are often necessary, e.g. forward, mongodb, s3, etc. Buffered output plugins store received events in buffers and write them out to a destination once the flush conditions are met. With a buffered output you do not see received events immediately, unlike with the non-buffered stdout output.
Buffers are important for reliability and throughput. See Output and Buffer articles.
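As a sketch of a buffered output, the match below forwards events to another Fluentd node and flushes on a timer. The server address, the buffer path and the flush interval are hypothetical values chosen for illustration:

```
<match test.cycle>
  @type forward
  <server>
    # Hypothetical downstream Fluentd node
    host 192.0.2.10
    port 24224
  </server>
  <buffer>
    # Keep buffered chunks on disk so they survive a restart
    @type file
    # Hypothetical buffer directory
    path /var/log/fluent/buffer/forward
    # Events become visible downstream only after a flush
    flush_interval 10s
  </buffer>
</match>
```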
Execution unit
This section describes the internal implementation.
By default, Fluentd's events are processed on the input plugin's thread. For example, if you have an in_tail -> filter_grep -> out_stdout pipeline, the whole pipeline is executed on in_tail's thread. The important point is that the filter_grep and out_stdout plugins do not have their own threads for processing.
A Buffered Output plugin has its own threads for flushing the buffer. For example, if you have an in_tail -> filter_grep -> out_forward pipeline, it is executed on in_tail's thread and on out_forward's flush thread. in_tail's thread processes events until they are written into the buffer. Flushing the buffer and handling flush errors are the output plugin's responsibility. This decouples input from output, and the model has several merits: separate responsibilities, easy error handling, and good performance.
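The flush side of this model can be tuned in the output's buffer section. The sketch below assumes the flush_thread_count, retry_wait and retry_max_times buffer parameters; the values and the server address are hypothetical:

```
<match test.cycle>
  @type forward
  <server>
    # Hypothetical downstream Fluentd node
    host 192.0.2.10
    port 24224
  </server>
  <buffer>
    # Number of flush threads owned by this output plugin
    flush_thread_count 2
    # Wait before the first retry of a failed flush
    retry_wait 1s
    # Give up on a chunk after this many retries
    retry_max_times 5
  </buffer>
</match>
```

Because retries happen on the flush threads, a slow or failing destination does not block the input plugin's thread until the buffer itself fills up.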
Conclusion
Once the events are reported by the Fluentd engine on the Source, they can be processed step by step or inside a referenced Label, and any Event may be filtered out at any point. The new Routing engine behavior aims to provide more flexibility and to simplify processing before events reach the Output plugin.
Fluentd is an open source project under the Cloud Native Computing Foundation (CNCF). All components are available under the Apache 2 License.