tail
Last updated
Last updated
The in_tail
Input plugin allows Fluentd to read events from the tail of text files. Its behavior is similar to the tail -F
command.
It is included in Fluentd's core.
Refer to the Configuration File article for the basic structure and syntax of the configuration file.
For <parse>
, see Parse Section.
When Fluentd is first configured with in_tail
, it will start reading from the tail of that log, not the beginning. Once the log is rotated, Fluentd starts reading the new file from the beginning. It keeps track of the current inode number.
If td-agent
restarts, it resumes reading from the last position before the restart. This position is recorded in the position file specified by the pos_file
parameter.
Since v1.12.0, in_tail
handles the following Linux capabilities if Fluentd's Linux capability handling module is enabled:
CAP_DAC_READ_SEARCH
(:dac_read_search
on in_tail
code.)
CAP_DAC_OVERRIDE
(:dac_override
on in_tail
code.)
See also: Linux capability
See Common Parameters.
@type
(required)The value must be tail
.
tag
string
required parameter
0.14.0
The tag of the event.
*
can be used as a placeholder that expands to the actual file path, replacing '/'
with '.'
.
With the following configuration:
in_tail
emits the parsed events with the foo.path.to.file
tag.
path
string
required parameter
0.14.0
The path(s) to read. Multiple paths can be specified, separated by comma ','
.
*
and strftime
format can be included to add/remove the watch file dynamically. At the interval of refresh_interval
, Fluentd refreshes the list of watch files.
For multiple paths:
If the date is 20140401
, Fluentd starts to watch the files in /path/to/2014/04/01
directory. See also read_from_head
parameter.
By default, You should not use *
with log rotation because it may cause the log duplication. To avoid log duplication, you need to set follow_inodes true
in the configuration.
If you want to use other glob patterns such as []
and ?
, you need to set up glob_policy extended
as described in the glob_policy
section.
path_timezone
string
nil
1.8.1
This parameter is for strftime
formatted path like /path/to/%Y/%m/%d/
.
in_tail
uses system timezone by default. This parameter overrides it:
For timezone format, see Timezone Section.
glob_policy
enum
backward_compatible
backward_compatible/extended/always
1.17.0
This parameter permits to extend glob patterns on path
and exclude_path
parameters. When specifying extended
, users can use []
and ?
in glob patterns. When specifying always
, users can use []
, ?
, and additionally {}
in glob patterns.
However, always
option is not able to use with the default value of path_delimiter
. When using the default value of path_delimiter
, it will be marked as Fluent::ConfigError
.
exclude_path
array
[]
(empty)
0.14.0
The paths excluded from the watcher list.
For example, to remove the compressed files, you can use the following pattern:
exclude_path
takes input as an array, unlike path
which takes as a string.
follow_inodes
bool
false
1.12.0
Avoid to read rotated files duplicately. You should set true
when you use *
or strftime
format in path
.
refresh_interval
time
60 (seconds)
0.14.0
The interval to refresh the list of watch files. This is used when the path includes *
.
limit_recently_modified
time
nil (disabled)
0.14.13
Limits the watching files that the modification time is within the specified time range when using *
in path
.
skip_refresh_on_startup
bool
false
0.14.13
Skips the refresh of the watch list on startup. This reduces the startup time when *
is used in path
.
read_from_head
bool
false
0.14.0
Starts to read the logs from the head of the file or the last read position recorded in pos_file
, not tail.
Notes:
in_tail
tries to read a file during the startup phase when this is true
. So that if the target file is too large and takes a long time to read it, other plugins are blocked to start until the reading is finished. You can avoid it by skip_refresh_on_startup
.
For Fluentd <= v1.14.2: If you use *
or strftime
format as path
and new files may be added into such paths while tailing, you should set this parameter to true
. Otherwise some logs in newly added files may be lost. On the other hand you should guarantee that the log rotation will not occur in *
directory in that case to avoid log duplication. Or you can use follow_inodes true
to avoid such log duplication, which is available as of v1.12.0.
From Fluentd v1.14.3, in_tail
reads newly added files from head automatically even if read_from_head
is false
. read_from_head false
is affected only on start up.
encoding
, from_encoding
string
nil (string encoding is ASCII-8BIT
)
0.14.0
Specifies the encoding of reading lines.
By default, in_tail
emits string value as ASCII-8BIT encoding.
These options change it:
If encoding
is specified, in_tail
changes string to encoding
.
This uses Ruby's String#force_encoding
.
If encoding
and from_encoding
both are specified, in_tail
tries to
encode string from from_encoding
to encoding
. This uses Ruby's
You can get the list of supported encodings with this command:
read_lines_limit
integer
1000
0.14.0
The number of lines to read with each I/O operation.
If you see chunk bytes limit exceeds for an emitted event stream
or similar log with in_tail
, set a smaller value.
read_bytes_limit_per_second
size
-1 (unlimited)
1.13.0
The number of reading bytes per second to read with I/O operation.
This value should be equal or greater than 8192.
If you work with a big cluster with high volume of log, you can use this parameter to avoid network saturation and make it easier to calculate the max throughput per node. To restrict shipping log volumes per second, set a positive number.
max_line_size
size
nil
1.14.4
The maximum length of a line. Longer lines than it will be just skipped.
If you see BufferChunkOverflowError
exception frequently, it means that incoming data is too long. If such a long line is unexpected incoming data and want to ignore it, then set a smaller value than chunk_limit_size
in <buffer>
section.
multiline_flush_interval
time
nil (disabled)
0.14.0
The interval of flushing the buffer for multiline format.
If you set multiline_flush_interval 5s
, in_tail
flushes buffered event after 5 seconds from last emit. This option is useful when you use format_firstline
option.
pos_file
(highly recommended)string
nil
0.14.0
Fluentd will record the position it last read from this file:
pos_file
handles multiple positions in one file so no need to have multiple pos_file
parameters per source
.
Don't share pos_file
between in_tail
configurations. It causes unexpected behavior e.g. corrupt pos_file
content.
in_tail
removes the untracked file position at startup. It means that the content of pos_file
keeps growing until a restart when you tail lots of files with the dynamic path setting. This issue can be solved by using pos_file_compaction_interval
.
pos_file_compaction_interval
time
nil
1.9.2
The interval of doing compaction of pos file.
The targets of compaction are unwatched, unparsable, and the duplicated line. You can use this value when pos_file
option is set:
<parse>
Directive (required)The format of the log.
in_tail
uses the parser plugin to parse the log. See parser
for more detail.
Examples:
If @type
contains multiline
, in_tail
works in multiline mode.
format
Deprecated parameter. Use <parse>
instead.
path_key
string
nil (no assign)
0.14.0
Adds the watching file path to the path_key
field.
With this configuration:
The generated events are like this:
rotate_wait
time
5 (seconds)
0.14.0
in_tail
actually does a bit more than tail -F
itself. When rotating a file, some data may still need to be written to the old file as opposed to the new one.
in_tail
takes care of this by keeping a reference to the old file (even after it has been rotated) for some time before transitioning completely to the new file. This helps prevent data designated for the old file from getting lost. By default, this time interval is 5 seconds.
The rotate_wait
parameter accepts a single integer representing the number of seconds you want this time interval to be.
enable_watch_timer
bool
true
0.14.0
Enables the additional watch timer. Setting this parameter to false
will significantly reduce CPU and I/O consumption when tailing a large number of files on systems with inotify
support. The default is true
which results in an additional 1 second timer being used.
in_tail
(via Cool.io
) uses inotify
on systems which support it. Earlier versions of libev
on some platforms (e.g. macOS) did not work properly; therefore, an explicit 1 second timer was used. Even on systems with inotify
support, this results in additional I/O each second, for every file being tailed.
Early testing demonstrates that modern Cool.io
and in_tail
work properly without the additional watch timer. In the future, depending on the feedback and testing, the additional watch timer may be disabled by default.
enable_stat_watcher
bool
true
1.0.1
Enables the additional inotify
-based watcher. Setting this parameter to false
will disable the inotify
events and use only timer watcher for file tailing.
This option is mainly for avoiding the stuck issue with inotify
.
open_on_every_update
bool
false
0.14.12
Opens and closes the file on every update instead of leaving it open until it gets rotated.
emit_unmatched_lines
bool
false
0.14.12
Emits unmatched lines when <parse>
format is not matched for incoming logs.
Emitted record is {"unmatched_line" : incoming line}
, e.g. {"unmatched_line" : "Non JSON format!"}
.
ignore_repeated_permission_error
bool
false
0.14.0
If you have to exclude the non-permission files from the watch list, set this parameter to true
. It suppresses the repeated permission error logs.
@log_level
The @log_level
option allows the user to set different levels of logging for each plugin. The supported log levels are: fatal
, error
, warn
, info
, debug
, and trace
.
Refer to the Logging for more details.
<group>
SectionThe in_tail
plugin can assign each log file to a group, based on user defined rules. The limit
parameter controls the total number of lines collected for a group within a rate_period
time interval.
Example:
pattern
regexp
/^\/var\/log\/containers\/(?<podname>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\/[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace>[^_]+)_(?<container>.+)-(?<docker_id>[a-z0-9]{64})\.log$/
1.15
Specifies the regular expression for extracting metadata (namespace, podname) from log file path. Default value of the pattern regexp extracts information about namespace
, podname
, docker_id
, container
of the log (K8s specific).
You can also add custom named captures in pattern
for custom grouping of log files. For example,
In this example, filename will be extracted and used to form groups.
rate_period
time
60 (seconds)
1.15
Time period in which the group line limit is applied. in_tail
resets the counter after every rate_period
interval.
<rule>
Section (required)Grouping rules for log files.
match
hash
{"namespace": "/./", "podname": "/./"}
1.15
match
parameter is used to check if a file belongs to a particular group based on hash keys (named captures from pattern
) and hash values (regexp in string)
limit
integer
-1
1.15
Maximum number of lines allowed from a group in rate_period
time interval. The default value of -1
doesn't throttle log files of that group.
<parse>
type is not matched for logs?in_tail
prints warning message. For example, if you specify @type json
in <parse>
and your log line is 123,456,str,true
, then you will see following message in fluentd logs:
See also emit_unmatched_lines
parameter.
in_tail
doesn't start to read the log file, why?in_tail
follows tail -F
command's behavior by default, so in_tail
reads only the new logs. If you want to read the existing lines for the batch use case, set read_from_head true
.
in_tail
shows /path/to/file unreadable
log message. Why?If you see this message:
/path/to/file
unreadable. It is excluded and would be examined next time.
It means that fluentd
does not have read permission for /path/to/file
. Check your fluentd and target files permission.
Note: When td-agent
is launched by systemd, the default user of the td-agent
process is the td-agent
user. You must ensure that this user has read permission to the tailed /path/to/file
. For instance, on Ubuntu, the default Nginx access file /var/log/nginx/access.log
is mode 0640
and owned by www-data:adm
. In this case, several options are available to allow read access:
Add the td-agent
user to the adm
group, e.g. through usermod -aG
, or
Use the cap_dac_read_search
capability to allow the invoking user to read the file without otherwise changing its permission bits or ownership.
A bug exists in Fluentd 1.13.x where it may suppress warning logs about unreadable files. (See Fluentd PR #3478.)
logrotate
Settinglogrotate
has the nocreate
parameter and it does not create a new file if log rotation is triggered. It means in_tail
cannot find the new file to tail.
This parameter does not fit the typical application log use cases, so check your logrotate
setting which does not include the nocreate
parameter.
in_tail
receives BufferOverflowError
?in_tail
stops reading the new lines and pos file updates until BufferOverflowError
is resolved. After resolving BufferOverflowError
, resume emitting new lines and pos file updates.
in_tail
is sometimes stopped when monitor lots of files. How to avoid it?Try to set enable_stat_watcher false
in in_tail
setting. We got several reports that in_tail
is stopped when *
is included in path
, and the problem is resolved by disabling the inotify
events.
path
does not work on Windows, why?The backslash (\
) with *
does not work on Windows by internal limitations. To avoid this, use slash style instead:
If this article is incorrect or outdated, or omits critical information, please let us know. Fluentd is an open-source project under Cloud Native Computing Foundation (CNCF). All components are available under the Apache 2 License.
Example,
In this case, rules with more constraints, i.e., greater number of match
hash keys will be given a higher priority. So a file will be assigned to Rule2
if it can be assigned to both Rule1
and Rule2
.