Config: Parse Section

Some of the Fluentd plugins support the <parse> section to specify how to parse the raw data.

Parse Section Overview

The parse section can be under <source>, <match> or <filter> section. It is enabled for the plugins that support parser plugin features.

<source>
  @type tail
  # ...
  <parse>
    # ...
  </parse>
</source>

Parser Plugin Type

The @type parameter of <parse> section specifies the type of the parser plugin. Fluentd core bundles some useful parser plugins.

<parse>
  @type apache2
</parse>

Third-party plugins may also be installed and configured.

For more details, see plugins documentation.

Parameters

`@type`

The @type parameter specifies the type of the parser plugin.

<parse>
  @type regexp
  # ...
</parse>

Here's the list of built-in parser plugins:

Parse Parameters

NOTE: It depends on an input plugin's capabilities whether can handle a <parse> section or not. See each plugin's documentation in detail.

The default value of the following parameters will be overridden by the individual parser plugins:

types (hash) (optional): Specify types for converting field into another
type. See below the details of the "types Parameter" section.
- Default: nil
- string-based hash: field1:type, field2:type, field3:type:option, field4:type:option
- JSON format: {"field1":"type", "field2":"type", "field3":"type:option", "field4":"type:option"}
- example: types user_id:integer,paid:bool,paid_usd_amount:float
time_key (string) (optional): Specify time field for event time. If
the event doesn't have this field, current time is used.
- Default: nil
- Note that json,
  ltsv and
  regexp override the default value of this
  parameter and set it to time by default.
null_value_pattern (regexp) (optional): Specify null value pattern.
- Default: nil
null_empty_string (bool) (optional): If true, empty string field is
replaced with nil.
- Default: false
estimate_current_event (bool) (optional): If true, use
Fluent::EventTime.now(current time) as a timestamp when time_key is
specified.
- Default: true
keep_time_key (bool) (optional): If true, keep time field in the
record.
- Default: false
timeout (time) (optional): Specify timeout for parse processing. This
is mainly for detecting wrong regexp pattern.
- Default: nil

`types` Parameter

For the types parameter, the following types are supported:

string: Converts the field into String type. This uses to_s method for conversion.
bool: Converts the string "true", "yes" or "1" into true. Otherwise, false.
integer (not int): Converts the field into the Integer type. This uses to_i method for conversion. For example, the string "1000" converts into 1000.
float: Converts the field into Float type. This uses to_f method for conversion. For example, the string "7.45" converts into 7.45.
time: Converts the field into Fluent::EventTime type. This uses Fluentd time parser for conversion. For the time type, the third field specifies the time format similar to time_format.
```
  date:time:%d/%b/%Y:%H:%M:%S %z # for string with time format
  date:time:unixtime             # for integer time
  date:time:float                # for float time
```
See time_type and time_format parameters in Time parameters section.
array: Converts the string field into Array type. For the array type, the third field specifies the delimiter (the default is comma ","). For example, if a field item_ids contains the value "3,4,5", types item_ids:array parses it as ["3", "4", "5"]. Alternatively, if the value is "Adam|Alice|Bob", types item_ids:array:| parses it as ["Adam", "Alice", "Bob"].

Time Parameters

time_type (enum) (optional): parses/formats value according to this
type
- Default: string
- Available values: float, unixtime, string, mixed
  - float: seconds from Epoch + nano seconds (e.g.
    1510544836.154709804)
  - unixtime: seconds from Epoch (e.g. 1510544815)
  - string: use format specified by time_format, local time or time
    zone
  - mixed: enable time_format_fallbacks option. (Since Fluentd v1.12.2)
time_format (string) (optional): processes value according to the
specified format. This is available only when time_type is string.
- Default: nil
- Available time format:
  - For more details about formatting, see
    Time#strftime.
  - For more details about parsing, see
    Time.strptime.
  - %iso8601 (only for parsing)
  - Use %N to parse/format with sub-second precision, because
    strptime does not support
    %3N, %6N, %9N, and %L.
localtime (bool) (optional): if true, uses local time. Otherwise,
UTC is used. This is exclusive with utc.
- Default: true
utc (bool) (optional): if true, uses UTC. Otherwise, local time is
used. This is exclusive with localtime.
- Default: false
timezone (string) (optional): uses the specified timezone. One can
parse/format the time value in the specified timezone format.
- Default: nil
- Available time zone format:
  1. [+-]HH:MM (e.g. "+09:00") (recommended)
  2. [+-]HHMM (e.g. "+0900")
  3. [+-]HH (e.g. "+09")
  4. Region/Zone (e.g. Asia/Tokyo)
  5. Region/Zone/Zone (e.g. America/Argentina/Buenos_Aires)
time_format_fallbacks () (optional): uses the specified time format as a fallback in the specified order.
You can parse undetermined time format by using time_format_fallbacks. This options is enabled when time_type is mixed.
- Default: nil
```
  time_type mixed
  time_format unixtime
  time_format_fallbacks %iso8601
```
In the above use case, the timestamp is parsed as unixtime at first, if it fails, then it is parsed as %iso8601 secondary. Note that time_format_fallbacks is the last resort to parse mixed timestamp format. There is a performance penalty (Typically, N fallbacks are specified in time_format_fallbacks and if the last specified format is used as a fallback, N times slower in the worst case).

If this article is incorrect or outdated, or omits critical information, please let us know. Fluentd is an open-source project under Cloud Native Computing Foundation (CNCF). All components are available under the Apache 2 License.

PreviousConfig: Common Parameters NextConfig: Buffer Section

Last updated 1 year ago

Was this helpful?