Failure Scenarios
This article describes various Fluentd failure scenarios. We will assume that you have configured Fluentd for high availability, so that each application node runs a local forwarder and all logs are aggregated into multiple aggregators.
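For reference, a minimal forwarder configuration for such a setup might look like the sketch below. The app.** tag and the aggregator hostnames are placeholders, and the exact parameters depend on your deployment.

    # Accept records from local applications
    <source>
      @type forward
      port 24224
    </source>

    # Forward everything to the aggregator tier
    <match app.**>
      @type forward
      <server>
        name aggregator1
        host aggregator1.example.com   # placeholder hostname
        port 24224
      </server>
      <server>
        name aggregator2
        host aggregator2.example.com   # placeholder hostname
        port 24224
        standby                        # used only if aggregator1 is unavailable
      </server>
    </match>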
An application sometimes fails to post records to its local Fluentd instance when using one of the language-specific logger libraries. Depending on the maturity of each logger library, various mechanisms have been implemented to prevent data loss.
If the destination Fluentd instance dies, certain logger implementations will use extra memory to hold the incoming logs. When Fluentd comes back up, these loggers will automatically resend the buffered logs. Once the maximum buffer memory size is reached, most current implementations will either write the data to disk or discard the logs.
When trying to resend logs to the local forwarder, some implementations use exponential backoff to avoid excessive reconnection attempts.
What happens when a Fluentd process dies for any reason? It depends on your buffer configuration.
buf_memory
If you are using buf_memory, the buffered data is completely lost. This is a tradeoff for higher performance. Lowering the flush_interval will reduce the probability of data loss, but will increase the number of transfers between forwarders and aggregators.
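A minimal sketch of a memory-buffered forward output follows; the tag, hostname, and flush interval are placeholders, not recommended values.

    <match app.**>
      @type forward
      <server>
        host aggregator1.example.com   # placeholder hostname
        port 24224
      </server>
      <buffer>
        @type memory
        flush_interval 5s   # shorter interval: less data at risk, more frequent transfers
      </buffer>
    </match>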
buf_file
If you are using buf_file, the buffered data is stored on disk. After Fluentd recovers, it will try to send the buffered data to the destination again.
Please note that the data will be lost if the buffer file is corrupted by an I/O error. The data will also be lost if the disk is full.
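A file-buffered version of the same output might look like the sketch below; the buffer path is a placeholder and must be writable by the Fluentd process.

    <match app.**>
      @type forward
      <server>
        host aggregator1.example.com   # placeholder hostname
        port 24224
      </server>
      <buffer>
        @type file
        path /var/log/fluentd/buffer/forward   # chunks here survive a process restart
        flush_interval 10s
      </buffer>
    </match>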
If the storage destination (e.g. Amazon S3, MongoDB, HDFS) goes down, Fluentd will keep trying to resend the buffered data. The retry logic depends on the plugin implementation.
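The retry behavior can usually be tuned in the output plugin's buffer section. The sketch below shows common retry parameters on a hypothetical S3 output; the bucket, region, and paths are placeholders, and credentials are assumed to come from the environment (e.g. an IAM role).

    <match app.**>
      @type s3
      s3_bucket my-log-archive   # placeholder bucket
      s3_region us-east-1
      path logs/
      <buffer time>
        @type file
        path /var/log/fluentd/buffer/s3   # placeholder path
        timekey 3600                      # roll chunks hourly
        retry_type exponential_backoff    # default retry strategy
        retry_wait 1s                     # initial wait between retries
        retry_max_interval 60s            # cap on the backoff interval
        retry_max_times 17                # give up after this many attempts
      </buffer>
    </match>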
If you are using buf_memory, the aggregators will stop accepting new logs once they reach their buffer limits. If you are using buf_file, the aggregators will continue accepting logs until they run out of disk space.
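On the aggregator side, the buffer limit and the behavior on reaching it can be set explicitly. A sketch with placeholder sizes, paths, and downstream host:

    <match app.**>
      @type forward
      <server>
        host backend.example.com   # placeholder downstream host
        port 24224
      </server>
      <buffer>
        @type file
        path /var/log/fluentd/buffer/aggregator   # placeholder path
        total_limit_size 8GB        # stop buffering beyond this size
        overflow_action block       # alternatives: throw_exception, drop_oldest_chunk
      </buffer>
    </match>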