This article describes how to optimize Fluentd's performance with the in_multiprocess plugin. Under high traffic, Fluentd tends to be CPU bound.
However, because of Ruby MRI's GVL (Global VM Lock), a single Fluentd process can use only one CPU core. With in_multiprocess, Fluentd can fully utilize multiple CPU cores to handle more requests.
Before moving to a multi-process configuration, please make sure you have done all the optimization you can with a single process.
In a multi-process environment, we recommend the following 2-tier process topology within the same server.
The 1st tier input processes are in charge of input (typically in_tail or other input plugins) and its parsing. The input processes transfer all records to the 2nd tier output processes via out_forward.
In the config, we recommend using a shorter flush_interval (e.g. 1s) with a smaller buffer_chunk_size (e.g. 1MB) to immediately forward incoming data to the 2nd tier output processes.
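As a minimal sketch of what a 1st tier input process config might look like under these recommendations, assuming v0.12-style syntax: the file name, paths, tag, and ports 24001/24002 are hypothetical placeholders, not values taken from the example repository. The chunk size is written as buffer_chunk_limit, which, as far as we know, is the v0.12 parameter name for this setting.

```
# in_1.conf (hypothetical file name)
<source>
  @type tail
  # Placeholder paths and tag; adjust to your environment.
  path /var/log/app/access.log
  pos_file /var/log/fluent/in_1.pos
  tag app.access
  # Replace with the parser that matches your log format.
  format none
</source>

<match **>
  @type forward
  # File buffer so that records survive a process restart.
  buffer_type file
  buffer_path /var/log/fluent/buffer/in_1
  # Small chunks and a short flush interval, so data moves on quickly.
  buffer_chunk_limit 1m
  flush_interval 1s
  # One <server> entry per 2nd tier output process (assumed ports).
  <server>
    host 127.0.0.1
    port 24001
  </server>
  <server>
    host 127.0.0.1
    port 24002
  </server>
</match>
```

Listing every output process as a separate server is what lets out_forward spread the records across them.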
The 2nd tier output processes are in charge of filtering and output to your destinations. They accept incoming data from the input processes via in_forward.
The output processes are shared across all input processes through out_forward's load balancing mechanism. This way, you can simply add more output processes to increase capacity across all inputs.
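A matching 2nd tier output process could be sketched as follows, again assuming v0.12-style syntax. The file name, port 24001, the added record field, and the file destination are hypothetical stand-ins for whatever filters and outputs you actually use.

```
# out_1.conf (hypothetical file name)
<source>
  @type forward
  # Each output process listens on its own port (assumed value).
  bind 127.0.0.1
  port 24001
</source>

<filter **>
  @type record_transformer
  <record>
    # Illustrative field only; put your real filtering here.
    processed_by out_1
  </record>
</filter>

<match **>
  # Placeholder destination; swap in your real output plugin.
  @type file
  path /var/log/fluent/out_1/data
</match>
```

Adding capacity then means starting another process like this one on a new port and listing that port as an additional server in each input process's out_forward section.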
There are a couple of reasons why we recommend this design.
Each process can use nearly 100% of its CPU core with a much narrower focus (either input or output). For example, in_tail can now focus on tailing and parsing, rather than sharing CPU time with filtering and output.
To handle more volume, you can simply add more input and output processes. out_forward automatically distributes the load across the processes.
If any process goes down, the supervisor process automatically relaunches it. We also recommend using buf_file for both input and output processes to prevent data loss.
For example, even if one of the output processes dies, the data is buffered and routed to a different output process automatically. The crashed process is also automatically relaunched by the supervisor process.
This git repository contains the fully functional multi-process settings for Fluentd.
You can simply test with td-agent, or directly use Ruby to launch Fluentd.
$ bundle install
$ bundle exec fluentd -c fluentd.conf
This config uses in_multiprocess to launch both input and output processes.
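For orientation, a top-level config that launches child processes with in_multiprocess might look roughly like the sketch below. It assumes the fluent-plugin-multiprocess plugin is installed. The child config names (in_1.conf, out_1.conf, out_2.conf) and log file names are hypothetical and are not the repository's actual file list.

```
# fluentd.conf (sketch; child file names are placeholders)
<source>
  @type multiprocess

  # 1st tier: input process
  <process>
    cmdline -c in_1.conf --log in_1.log
    sleep_before_start 1s
    sleep_before_shutdown 5s
  </process>

  # 2nd tier: output processes
  <process>
    cmdline -c out_1.conf --log out_1.log
    sleep_before_start 1s
    sleep_before_shutdown 5s
  </process>
  <process>
    cmdline -c out_2.conf --log out_2.log
    sleep_before_start 1s
    sleep_before_shutdown 5s
  </process>
</source>
```

Each process section starts one child Fluentd with its own config file, so adding a process means adding one more section here along with its child config.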
The example config contains 3 input processes:
These input processes focus on receiving data via either TCP or UDP, and simply forward it to the 2nd tier output processes.
The example config contains 4 output processes:
These output processes receive data from input processes, apply filters, and output to the destinations.
The ./tmpl/ directory has a config generation script, in case you'd like to increase the number of processes.
$ bash gen_out.sh
These files are used to collect metrics from Fluentd and expose them to the Prometheus monitoring system. The repository also has a
./prometheus directory, which contains example
in_xyz.conf. Please make sure you have a
<match> section to forward incoming data to the 2nd tier output processes. Finally, register the process in
out_N.conf, and register the process in both
fluentd.conf, which lists all the output processes.
We recommend keeping the total number of processes approximately equal to the number of CPU cores.
When you run a benchmark, look carefully at which processes are the bottleneck, and add processes according to their resource usage.
Fluentd v1.0 or later has native multi-process support. We recommend upgrading to simplify the config file if possible.
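To give a rough idea of the simplification, native multi-process support in v1.0 or later is enabled through the workers parameter in the system section. The worker count, port, and stdout destination below are illustrative assumptions only.

```
# fluentd.conf for Fluentd v1.0 or later (sketch)
<system>
  # Number of worker processes; 4 is an arbitrary example value.
  workers 4
</system>

<source>
  @type forward
  port 24224
</source>

<match **>
  # Placeholder destination.
  @type stdout
</match>
```

With this form, a single config file replaces the separate per-process files. Plugins that do not support running in multiple workers may need to be confined to one worker with a worker directive, so check each plugin's documentation before migrating.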
If this article is incorrect or outdated, or omits critical information, please let us know. Fluentd is an open source project under the Cloud Native Computing Foundation (CNCF). All components are available under the Apache 2 License.