Versions | v0.12 (td-agent2) | v0.10 (td-agent1)

UDP Input Plugin

The in_udp Input plugin enables Fluentd to accept UDP payloads.

Example Configuration

in_udp is included in Fluentd’s core. No additional installation process is required.

<source>
  @type udp
  tag mytag # required
  format /^(?<field1>\d+):(?<field2>\w+)$/ # required
  port 20001 # optional. 5160 by default
  bind 0.0.0.0 # optional. 0.0.0.0 by default
  body_size_limit 1MB # optional. 4096 bytes by default
</source>
Please see the Config File article for the basic structure and syntax of the configuration file.
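To try the configuration above, it helps to send a test datagram. The following is a minimal Ruby sketch and not part of Fluentd: send_udp_payload is a hypothetical helper name, and the 127.0.0.1 default assumes Fluentd runs on the same machine. The port default matches the example, and the payload has to match the format regexp.

```ruby
require "socket"

# Hypothetical helper: send one UDP datagram to a listening in_udp source.
# The defaults are assumptions taken from the example configuration
# (port 20001) and a Fluentd instance running on the local machine.
def send_udp_payload(payload, host: "127.0.0.1", port: 20001)
  socket = UDPSocket.new
  # UDPSocket#send returns the number of bytes handed to the network stack.
  socket.send(payload, 0, host, port)
ensure
  socket&.close
end
```

With the source above running, calling send_udp_payload("123:hello") should produce an event tagged mytag whose record contains field1 and field2. UDP gives no delivery confirmation, so check the Fluentd output to see whether the event arrived.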

Parameters

type (required)

The value must be udp.

tag (required)

The tag of output events.

port

The port to listen on. Default Value = 5160

bind

The bind address to listen on. Default Value = 0.0.0.0

delimiter

The payload is read up to this character. By default, it is “\n”.

source_host_key

The field name for the client’s hostname. If a value is set, the client’s hostname is added to each record under that key. The default is nil (no hostname is added).

If you set the following configuration:

source_host_key client_host

then the client’s hostname is stored in the client_host field.

{
    ...
    "foo": "bar",
    "client_host": "client.hostname.org"
}

format (required)

The format of the UDP payload.

  • regexp

A regexp can be given as the value of the format parameter. If the parameter value starts and ends with “/”, it is treated as a regexp. The regexp must have at least one named capture (?<NAME>PATTERN). If the regexp has a capture named ‘time’, that capture is used as the time of the event. You can specify the time format using the time_format parameter.

fluentd-ui’s in_tail editor can help you test your regexp. Alternatively, Fluentular is a great website for testing regexps for Fluentd configuration.

You may hit an Application Error at Fluentular due to the limitations of Heroku’s free plan. Retry a few hours later or use fluentd-ui instead.
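A regexp can also be checked locally before it goes into the configuration. The sketch below is a simplified imitation of what a regexp format does, not Fluentd's own parser: parse_with_regexp is a hypothetical helper, and the sample pattern and line are made up for illustration.

```ruby
require "time"

# Hypothetical helper: match a line against a regexp with named captures and
# treat the capture named "time" as the event time.
def parse_with_regexp(regexp, line, time_format: nil)
  match = regexp.match(line)
  return nil unless match            # the line does not fit the format

  record = match.named_captures      # capture names become record keys
  time_text = record.delete("time")
  time = if time_text
           time_format ? Time.strptime(time_text, time_format) : Time.parse(time_text)
         end
  [time, record]
end

regexp = /^(?<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?<field1>\d+):(?<field2>\w+)$/
time, record = parse_with_regexp(regexp, "2016-12-19 03:25:48 123:hello",
                                 time_format: "%Y-%m-%d %H:%M:%S")
puts time.inspect
puts record.inspect
```

A nil return value means the line did not match, which is the case worth testing most carefully before deploying a new format.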
  • apache2

Reads Apache’s log file for the following fields: host, user, time, method, path, code, size, referer and agent. This template is analogous to the following configuration:

format /^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$/
time_format %d/%b/%Y:%H:%M:%S %z
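To see which fields this template yields, the regexp and time_format above can be applied to a sample line in Ruby. This is only a sketch: the constant names are hypothetical and the log line is made up for illustration.

```ruby
require "time"

# Regexp and time_format copied from the apache2 template above.
APACHE2_FORMAT = /^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$/
APACHE2_TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

# Made-up access log line.
line = '192.168.0.1 - - [28/Feb/2013:12:00:00 +0900] "GET / HTTP/1.1" 200 777 "-" "Opera/12.0"'

fields = APACHE2_FORMAT.match(line).named_captures
event_time = Time.strptime(fields.delete("time"), APACHE2_TIME_FORMAT)

puts event_time.inspect
puts fields.inspect
```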
  • apache_error

Reads Apache’s error log file for the following fields: time, level, pid, client and (error) message. This template is analogous to the following configuration:

format /^\[[^ ]* (?<time>[^\]]*)\] \[(?<level>[^\]]*)\](?: \[pid (?<pid>[^\]]*)\])? \[client (?<client>[^\]]*)\] (?<message>.*)$/
  • nginx

Reads Nginx’s log file for the following fields: remote, host, user, time, method, path, code, size, referer and agent. This template is analogous to the following configuration:

format /^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$/
time_format %d/%b/%Y:%H:%M:%S %z
  • syslog

Reads syslog’s output file (e.g. /var/log/syslog) for the following fields: time, host, ident, message and, when present, pid. This template is analogous to the following configuration:

format /^(?<time>[^ ]*\s*[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$/
time_format %b %d %H:%M:%S
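The syslog template can be exercised the same way. In this sketch the constant names are hypothetical and the log line is made up. Note that the time_format has no year field, so the year of the resulting timestamp has to come from somewhere else; Ruby's Time.strptime fills it in from the current date.

```ruby
require "time"

# Regexp and time_format copied from the syslog template above.
SYSLOG_FORMAT = /^(?<time>[^ ]*\s*[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$/
SYSLOG_TIME_FORMAT = "%b %d %H:%M:%S"

# Made-up syslog line.
line = "Feb 28 12:00:00 192.168.0.1 fluentd[11111]: [error] Syslog test"

fields = SYSLOG_FORMAT.match(line).named_captures
# No year in the input: Time.strptime takes the missing year from the current date.
event_time = Time.strptime(fields.delete("time"), SYSLOG_TIME_FORMAT)

puts event_time.inspect
puts fields.inspect
```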
  • tsv or csv

If you use tsv or csv format, please also specify the keys parameter.

format tsv
keys key1,key2,key3
time_key key2

If you specify the time_key parameter, it will be used to identify the timestamp of the record. The timestamp when Fluentd reads the record is used by default.

format csv
keys key1,key2,key3
time_key key3

The keys parameter is a comma-separated list. Do not put spaces around the commas, because every character is kept as part of the key name. For example, "key1,key2" is parsed as ["key1", "key2"], but "key1, key2" is parsed as ["key1", " key2"]. We will support a JSON-array-based parameter to avoid this problem.
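The effect of a stray space in keys is easy to reproduce. The sketch below assumes plain splitting on the separator with no quoting rules, which is simpler than a real CSV parser; parse_delimited is a hypothetical helper, not a Fluentd API.

```ruby
# Hypothetical helper: pair the names from a keys value with the fields of one line.
def parse_delimited(line, keys, separator)
  names = keys.split(",")               # no trimming, so spaces stay in the names
  values = line.split(separator, -1)    # -1 keeps trailing empty fields
  names.zip(values).to_h
end

tsv_record = parse_delimited("a\tb\tc", "key1,key2,key3", "\t")
csv_record = parse_delimited("a,b", "key1, key2", ",")

puts tsv_record.inspect
puts csv_record.keys.inspect   # the second key is " key2", including the space
```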
  • ltsv

ltsv (Labeled Tab-Separated Value) is a tab-delimited key-value pair format. You can learn more about it on its webpage.

format ltsv
label_delimiter =         # Optional. ':' is used by default
time_key time_field_name

If you specify the time_key parameter, it will be used to identify the timestamp of the record. The timestamp when Fluentd reads the record is used by default.
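As a rough illustration of the format itself, an LTSV line can be split in two steps: first into tab-separated pairs, then each pair into a label and a value. parse_ltsv and its keyword names are hypothetical and chosen for this sketch only.

```ruby
# Hypothetical helper: split an LTSV line into a Hash of label => value.
def parse_ltsv(line, field_separator: "\t", label_separator: ":")
  line.split(field_separator).each_with_object({}) do |pair, record|
    # Limit 2 keeps separators that appear inside the value, e.g. in a time.
    label, value = pair.split(label_separator, 2)
    record[label] = value
  end
end

record = parse_ltsv("time:2013/02/28 12:00:00\thost:192.168.0.1\treq:GET /")
puts record.inspect
```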

  • json

One JSON map per line. This is the most straightforward format :).

format json

The time_key parameter can also be specified. The default is ‘time’, and if the record has no time field, the json parser uses the current time as the event time.

format json
time_key key3

Without time_format, the json parser assumes the time field is an integer number of seconds.
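The time handling described above can be sketched as follows. This is a simplified model and not the parser's actual code: parse_json_event is a hypothetical helper that covers only the case without time_format, where the time field is read as integer seconds.

```ruby
require "json"

# Hypothetical helper: parse one JSON line and derive the event time.
def parse_json_event(line, time_key: "time", now: Time.now)
  record = JSON.parse(line)
  seconds = record.delete(time_key)            # the time field leaves the record
  time = seconds ? Time.at(seconds.to_i) : now # fall back to the current time
  [time, record]
end

time, record = parse_json_event('{"key3": 1362020400, "message": "hello"}', time_key: "key3")
puts time.to_i
puts record.inspect
```

Removing the time field from the record corresponds to the default behavior described under keep_time_key.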

  • none

You can use the none format to defer parsing/structuring the data. This will parse the line as-is with the key name “message”. For example, if you have the line

hello world. I am a line of log!

It will be parsed as

{"message":"hello world. I am a line of log!"}

The key field is “message” by default, but you can specify a different value using the message_key parameter as shown below:

format none
message_key my_message
  • multiline

Reads multiline logs using the format_firstline and formatN parameters. format_firstline detects the first line of a multiline log entry. formatN, where N ranges from 1 to 20, is the list of regexps that together describe the entry. Here is an example for Rails logs:

format multiline
format_firstline /^Started/
format1 /Started (?<method>[^ ]+) "(?<path>[^"]+)" for (?<host>[^ ]+) at (?<time>[^ ]+ [^ ]+ [^ ]+)\n/
format2 /Processing by (?<controller>[^\u0023]+)\u0023(?<controller_method>[^ ]+) as (?<format>[^ ]+?)\n/
format3 /(  Parameters: (?<parameters>[^ ]+)\n)?/
format4 /  Rendered (?<template>[^ ]+) within (?<layout>.+) \([\d\.]+ms\)\n/
format5 /Completed (?<code>[^ ]+) [^ ]+ in (?<runtime>[\d\.]+)ms \(Views: (?<view_runtime>[\d\.]+)ms \| ActiveRecord: (?<ar_runtime>[\d\.]+)ms\)/

If you have a multiline log

Started GET "/users/123/" for 127.0.0.1 at 2013-06-14 12:00:11 +0900
Processing by UsersController#show as HTML
  Parameters: {"user_id"=>"123"}
  Rendered users/show.html.erb within layouts/application (0.3ms)
Completed 200 OK in 4ms (Views: 3.2ms | ActiveRecord: 0.0ms)

It will be parsed as

{"method":"GET","path":"/users/123/","host":"127.0.0.1","controller":"UsersController","controller_method":"show","format":"HTML","parameters":"{\"user_id\"=>\"123\"}", ...}
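The Rails example can be reproduced in Ruby under one assumption: the formatN patterns are concatenated in order into a single regexp in which "." may also match newlines. The constant names below are hypothetical; the patterns and log lines are the ones shown above.

```ruby
# format1..format5 from the configuration above, in order.
RAILS_FORMATS = [
  /Started (?<method>[^ ]+) "(?<path>[^"]+)" for (?<host>[^ ]+) at (?<time>[^ ]+ [^ ]+ [^ ]+)\n/,
  /Processing by (?<controller>[^\u0023]+)\u0023(?<controller_method>[^ ]+) as (?<format>[^ ]+?)\n/,
  /(  Parameters: (?<parameters>[^ ]+)\n)?/,
  /  Rendered (?<template>[^ ]+) within (?<layout>.+) \([\d\.]+ms\)\n/,
  /Completed (?<code>[^ ]+) [^ ]+ in (?<runtime>[\d\.]+)ms \(Views: (?<view_runtime>[\d\.]+)ms \| ActiveRecord: (?<ar_runtime>[\d\.]+)ms\)/
].freeze

# Assumption: the patterns are joined into one multiline regexp.
RAILS_REGEXP = Regexp.new(RAILS_FORMATS.map(&:source).join, Regexp::MULTILINE)

log = [
  'Started GET "/users/123/" for 127.0.0.1 at 2013-06-14 12:00:11 +0900',
  'Processing by UsersController#show as HTML',
  '  Parameters: {"user_id"=>"123"}',
  '  Rendered users/show.html.erb within layouts/application (0.3ms)',
  'Completed 200 OK in 4ms (Views: 3.2ms | ActiveRecord: 0.0ms)'
].join("\n")

record = RAILS_REGEXP.match(log).named_captures
puts record.inspect
```

Because each pattern ends with \n, a missing or reordered line in the log makes the whole match fail, which is a common reason a multiline format produces no events.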

As another example, you can parse Java-like stack trace logs with multiline. Here is a configuration example.

format multiline
format_firstline /\d{4}-\d{1,2}-\d{1,2}/
format1 /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) \[(?<thread>.*)\] (?<level>[^\s]+)(?<message>.*)/

If you have the following log:

2013-3-03 14:27:33 [main] INFO  Main - Start
2013-3-03 14:27:33 [main] ERROR Main - Exception
javax.management.RuntimeErrorException: null
    at Main.main(Main.java:16) ~[bin/:na]
2013-3-03 14:27:33 [main] INFO  Main - End

It will be parsed as:

2013-03-03 14:27:33 +0900 zimbra.mailbox: {"thread":"main","level":"INFO","message":"  Main - Start"}
2013-03-03 14:27:33 +0900 zimbra.mailbox: {"thread":"main","level":"ERROR","message":" Main - Exception\njavax.management.RuntimeErrorException: null\n    at Main.main(Main.java:16) ~[bin/:na]"}
2013-03-03 14:27:33 +0900 zimbra.mailbox: {"thread":"main","level":"INFO","message":"  Main - End"}

When `format_firstline` is set, in_tail delays emitting a record until the next line matching `format_firstline` arrives, because without that trigger it cannot tell whether a multiline log entry has ended. If your regexps describe the log pattern completely, as in the Rails example above, you may remove `format_firstline` so that records are emitted immediately.
`multiline` works only with the in_tail plugin.
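The role of format_firstline in the stack trace example can be sketched in Ruby: lines are collected into one entry until the next line matching the first-line pattern appears, and only then is the entry parsed. This sketch has all lines available at once, whereas a tailing reader has to wait for the next first line. group_multiline and the constant names are hypothetical.

```ruby
STACKTRACE_FIRSTLINE = /\d{4}-\d{1,2}-\d{1,2}/
# The m flag lets "." match newlines, so message can span several lines.
STACKTRACE_FORMAT1 = /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) \[(?<thread>.*)\] (?<level>[^\s]+)(?<message>.*)/m

# Hypothetical helper: start a new entry at every line matching firstline.
def group_multiline(lines, firstline)
  lines.map(&:chomp)
       .slice_before { |line| firstline.match?(line) }
       .map { |group| group.join("\n") }
end

lines = [
  "2013-3-03 14:27:33 [main] INFO  Main - Start",
  "2013-3-03 14:27:33 [main] ERROR Main - Exception",
  "javax.management.RuntimeErrorException: null",
  "    at Main.main(Main.java:16) ~[bin/:na]",
  "2013-3-03 14:27:33 [main] INFO  Main - End"
]

group_multiline(lines, STACKTRACE_FIRSTLINE).each do |entry|
  puts STACKTRACE_FORMAT1.match(entry).named_captures.inspect
end
```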

keep_time_key

By default, the parser removes the time field from the event record. If you want to keep the time field in the record, set keep_time_key to true. The default is false.

log_level option

The log_level option allows the user to set different levels of logging for each plugin. The supported log levels are: fatal, error, warn, info, debug, and trace.

Please see the logging article for further details.

FAQ

How to prevent request drop?

If in_udp receives a large number of packets within one second, some packets are dropped. You can confirm this by checking whether the RcvbufErrors count reported by netstat -su increases.

This means a single in_udp process cannot handle that much traffic. Try fluent-plugin-multiprocess to resolve the problem. See issue 1334 for more detail.

Last updated: 2016-12-19 03:25:48 UTC

If this article is incorrect or outdated, or omits critical information, please let us know. Fluentd is an open source project under the Cloud Native Computing Foundation (CNCF), invented and sponsored by Treasure Data, Inc. under the Apache 2.0 License.