Buffer plugins are used by output plugins. For example,
buf_file by default to store incoming stream temporally before transmitting to S3.
Buffer plugins are, as you can tell by the name, pluggable. So you can choose a suitable backend based on your system requirements.
A buffer is essentially a set of "chunks". A chunk is a collection of events concatenated into a single blob. Each chunk is managed one by one in the form of files (buf_file) or continuous memory blocks (buf_memory).
You can think of a chunk as a cargo box. A buffer plugin uses a chunk as a lightweight container, and fills it with events incoming from input sources. If a chunk becomes full, then it gets "shipped" to the destination.
Internally, a buffer plugin has two separated places to store its chunks: "stage" where chunks get filled with events, and "queue" where chunks wait before the transportation. Every newly-created chunk starts from stage, then proceeds to queue in time (and subsequently gets transferred to the destination).
A chunk can fail to be written out to the destination for a number of reasons. The network can go down, or the traffic volumes can exceed the capacity of the destination node. To handle such common failures gracefully, buffer plugins are equipped with a built-in retry mechanism.
By default, Fluentd increases the wait interval exponentially for each retry attempt. For example, assuming that the initial wait interval is set to 1 second and the exponential factor is 2, each attempt occurs at the following time points:
1 2 4 8 16x-x---x-------x---------------x-------------------------│ │ │ │ └─ 4th retry (wait = 8s)│ │ │ └───────────────── 3th retry (wait = 4s)│ │ └───────────────────────── 2th retry (wait = 2s)│ └───────────────────────────── 1th retry (wait = 1s)└─────────────────────────────── FAIL
Note that, in practice, Fluentd tweaks this algorithm in a few aspects:
Wait intervals are randomized by default. That is, Fluentd
diversifies the wait interval by multiplying by a randomly-chosen
number between 0.875 and 1.125. You can turn off this behaviour by
retry_randomize to false.
Wait intervals can be capped to a certain limit. For example,
if you set
retry_max_interval to 5 seconds in the example above,
the 4th retry will wait for 5 seconds, instead of 8 seconds.
If you want to disable the exponential backoff, set the
retry_type option to "periodic".
Fluentd will abort the attempt to transfer the failing chunks on the following conditions:
The number of retries exceeds
retry_max_times (default: none)
The seconds elapsed since the first retry exceeds
In these events, all chunks in the queue are discarded. If you want to avoid this, you can enable
retry_forever to make Fluentd retry indefinitely.
Not all errors are recoverable in nature. For example, if the content of a chunk file gets corrupted, you obviously cannot fix anything just by redoing the write operation. Rather, a blind retry attempt will just make the situation worse.
Since v1.2.0, Fluentd can detect these non-recoverable failures. If these kinds of fatal errors occur, Fluentd will abort the chunk immediately and move it into
secondary or the backup directory. The exact location of the backup directory is determined by the parameter
If you don't need to back up chunks, you can enable
disable_chunk_backup (available since v1.2.6) in the
The following is the current list of exceptions considered "unrecoverable":
Exception The typical cause of this error
Fluent::UnrecoverableError Output plugin can use this exception to suppress further retry attempts for plugin specific unrecoverable error.
TypeError Occurs when an event has unexpected type in its target field.
ArgumentError Occurs when the plugin uses the library wrongly.
NoMethodError Occurs when events and configuration are mismatched.
Here are the patterns when unrecoverable error happens:
If the plugin doesn't have
secondary, the chunk is moved to backup
If the plugin has
secondary which is different type from primary,
the chunk is moved to
If the unrecoverable error happens inside
secondary, the chunk is
moved to backup directory.
Below is a full configuration example which covers all the parameters controlling retry bahaviours.
<system>root_dir /var/log/fluentd # For handling unrecoverable chunks</system><buffer>retry_wait 1 # The wait interval for the first retry.retry_exponential_backoff_base 2 # Inclease the wait time by a factor of N.retry_type exponential_backoff # Set 'periodic' for constant intervals.# retry_max_interval 1h # Cap the wait interval. (see above)retry_randomize true # Apply randomization. (see above)retry_timeout 72h # Maximum duration before giving up.# retry_max_times 17 # Maximum retry count before giving up.retry_forever false # Set 'true' for infinite retry loops.retry_secondary_threshold 0.8 # See the "Secondary Output" section in</buffer> # 'Output Plugins' > 'Overview'.
Normally, you don't need to specify every option as in this example, because these options are, in fact, optional. As for the detail of each option, please read this article.
Because the format of buffer chunk is different from output's payload. Let's use elasticsearch output plugin,
out_elasticsearch, for the detailed explanation.
out_elasticsearch uses MessagePack for buffer's serialization(NOTE that this depends on plugin). On the other hand, Elasticsearch's bulk API requires json based payload. It means one MessagePack-ed record is converted into 2 json lines. So the payload size is larger than buffer's chunk size.
This sometimes causes the problem when output destination has the payload size limitation. If you have a problem with payload size issue, check chunk size configuration and API spec.
If this article is incorrect or outdated, or omits critical information, please let us know. Fluentd is a open source project under Cloud Native Computing Foundation (CNCF). All components are available under the Apache 2 License.