Versions | v0.12 (td-agent2)

regexp Parser Plugin

The regexp parser plugin parses logs using a given regexp pattern. If the value of the format parameter starts and ends with “/”, it is treated as a regexp. The regexp must have at least one named capture, (?<NAME>PATTERN). If the regexp has a capture named time (the name is configurable via the time_key parameter), it is used as the time of the event. You can specify the time format using the time_format parameter.

format /.../ # regexp parser is used
format json  # json parser is used
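
To make the named-capture requirement concrete, here is a minimal Ruby sketch of how a pattern with named captures turns a log line into fields. The pattern, the sample line, and the variable names are hypothetical and chosen only for illustration; this is not the plugin's own code.

```ruby
# Hypothetical pattern and log line, chosen only for illustration.
pattern = /^(?<host>[^ ]+) (?<status>\d+)$/
line = "web01 200"

m = pattern.match(line)

# Each named capture becomes a field of the record.
record = m.names.zip(m.captures).to_h

puts record["host"]          # => web01
puts record["status"].class  # => String (every captured value starts out as a string)
```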

Parameters

time_key

Specify the field for event time. Default is time.

time_format

Specify the time format for the time_key field.

See Ruby's Time.strptime for additional format information.
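
As a rough illustration of what a time format string does, the following Ruby sketch parses a timestamp with Time.strptime from the standard library. The format and the sample value are hypothetical; the sketch only shows how the format directives map onto the text, not how the plugin itself is implemented.

```ruby
require "time"

# Hypothetical format and value, chosen for illustration.
time_format = "%d/%b/%Y:%H:%M:%S %z"
value = "28/Feb/2013:12:00:00 +0900"

event_time = Time.strptime(value, time_format)
puts event_time.to_i  # => 1362020400

# A value that does not match the format raises ArgumentError.
rejected = begin
  Time.strptime("not a timestamp", time_format)
  false
rescue ArgumentError
  true
end
puts rejected  # => true
```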

keep_time_key

Set this to true to keep the time field in the record. Default is false.

types

Although every parsed field has type string by default, you can specify other types. This is useful when filtering particular fields numerically or storing data with sensible type information.

The syntax is

types <field_name_1>:<type_name_1>,<field_name_2>:<type_name_2>,...

e.g.,

types user_id:integer,paid:bool,paid_usd_amount:float

As demonstrated above, “,” is used to delimit field-type pairs while “:” is used to separate a field name from its intended type.

Unspecified fields are parsed as the default string type.

The supported types are listed below:

  • string
  • bool
  • integer (“int” would NOT work!)
  • float
  • time
  • array

For the time and array types, there is an optional third field after the type name. For the “time” type, you can specify a time format like you would in time_format.

For the “array” type, the third field specifies the delimiter (the default is “,”). For example, if a field called “item_ids” contains the value “3,4,5”, types item_ids:array parses it as [“3”, “4”, “5”]. Alternatively, if the value is “Adam|Alice|Bob”, types item_ids:array:| parses it as [“Adam”, “Alice”, “Bob”].
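
The types syntax described above can be sketched in Ruby as follows. This is an illustrative approximation, not Fluentd's implementation: the helper names (CONVERTERS, parse_types, apply_types) are hypothetical, the set of strings treated as true for bool is an assumption, and the time type is left out for brevity.

```ruby
# Illustrative sketch only; this is not Fluentd's implementation.
CONVERTERS = {
  "string"  => ->(v, _arg) { v },
  "integer" => ->(v, _arg) { v.to_i },
  "float"   => ->(v, _arg) { v.to_f },
  # Assumption: which strings count as true is chosen for this sketch.
  "bool"    => ->(v, _arg) { %w[true yes 1].include?(v.downcase) },
  # The optional third field is the delimiter; the default is ",".
  "array"   => ->(v, arg)  { v.split(arg || ",") },
}

# Parse "field:type[:arg],..." into { field => [type, arg] }.
def parse_types(spec)
  spec.split(",").map { |pair|
    field, type, arg = pair.split(":", 3)
    [field, [type, arg]]
  }.to_h
end

# Convert the listed fields; unlisted fields stay as strings.
def apply_types(record, spec)
  types = parse_types(spec)
  record.map { |key, value|
    type, arg = types.fetch(key, ["string", nil])
    [key, CONVERTERS.fetch(type).call(value, arg)]
  }.to_h
end

raw = { "user_id" => "42", "paid" => "true", "paid_usd_amount" => "9.5",
        "item_ids" => "Adam|Alice|Bob", "note" => "hello" }
typed = apply_types(raw, "user_id:integer,paid:bool,paid_usd_amount:float,item_ids:array:|")
p typed
```

In this sketch, user_id becomes the integer 42, paid becomes true, paid_usd_amount becomes the float 9.5, item_ids becomes a three-element array, and the unlisted note field remains a string.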

Example

format /^\[(?<logtime>[^\]]*)\] (?<name>[^ ]*) (?<title>[^ ]*) (?<id>\d*)$/
time_key logtime
time_format %Y-%m-%d %H:%M:%S %z
types id:integer

With this config, the following incoming log line:

[2013-02-28 12:00:00 +0900] alice engineer 1

is parsed as:

time:
1362020400 (2013-02-28 12:00:00 +0900)

record:
{
  "name" : "alice",
  "title": "engineer",
  "id"   : 1
}
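
The result above can be reproduced by hand with a short Ruby sketch. It uses the regexp, time format, and log line from this example, and approximates the effect of time_key, the default keep_time_key false, and types id:integer with plain Ruby calls. It is a way to check the expected values, not the plugin's actual code path.

```ruby
require "time"

# Values taken from the example above.
pattern = /^\[(?<logtime>[^\]]*)\] (?<name>[^ ]*) (?<title>[^ ]*) (?<id>\d*)$/
time_format = "%Y-%m-%d %H:%M:%S %z"
line = "[2013-02-28 12:00:00 +0900] alice engineer 1"

m = pattern.match(line)
record = m.names.zip(m.captures).to_h

# time_key logtime: take the field out of the record and use it as the
# event time (keep_time_key defaults to false).
event_time = Time.strptime(record.delete("logtime"), time_format).to_i

# types id:integer
record["id"] = record["id"].to_i

puts event_time  # => 1362020400
p record         # name, title, and the integer id
```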

FAQ

How to debug my regexp pattern?

The in_tail editor in fluentd-ui helps you test your regexp. Alternatively, Fluentular is a website for testing regexps for Fluentd configuration.

NOTE: You may hit an Application Error at Fluentular due to Heroku free plan limitations. Retry a few hours later or use fluentd-ui instead.

Last updated: 2017-06-28 14:07:24 +0000

If this article is incorrect or outdated, or omits critical information, please let us know. Fluentd is an open source project under the Cloud Native Computing Foundation (CNCF), originally invented by Treasure Data, Inc. All components are available under the Apache 2 License.
