exec
The in_exec Input plugin executes external programs to receive or pull event logs. It then reads TSV (tab-separated values), JSON, or MessagePack from the program's stdout.
You can run a program periodically or permanently. To run periodically, please use the run_interval parameter.
Example Configuration
in_exec is included in Fluentd's core. No additional installation process is required.
Please see the Config File article for the basic structure and syntax of the configuration file.
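As a rough illustration of how the parameters described in this article fit together, a source block might look like the sketch below. The command path, key names, and tag are hypothetical placeholders, not values taken from the original article. Newer Fluentd releases spell the first parameter @type.

```
<source>
  type exec
  # Hypothetical program; it must print one TSV line per event to stdout.
  command /path/to/emit_events.sh
  format tsv
  # Field names assigned to the tab-separated columns, in order.
  keys time,host,message
  # Hypothetical tag for the emitted events.
  tag exec.events
  # Use the "time" column (UNIX time by default) as the event time.
  time_key time
  # Re-run the command every 10 seconds.
  run_interval 10s
</source>
```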
Parameters
type (required)
The value must be exec.
command (required)
The command (program) to execute.
format
The format used to map the program output to the incoming event.
The following formats are supported:
tsv (default)
json
msgpack
parser plugin formats, e.g. ltsv, none.
When using the tsv format, please also specify the keys parameter as a comma-separated list of field names.
When using the json format, this plugin uses the Yajl library to parse the program output. Yajl buffers data internally so the output isn't always instantaneous.
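To make the tsv and json formats concrete, here is a minimal sketch of what a program run by in_exec could print. The KEYS list and the sample record are assumptions made for illustration; with the tsv format, the column order would need to match whatever is set in the keys parameter.

```ruby
require "json"

# Hypothetical key order; with "format tsv" it must match the keys parameter.
KEYS = %w[time host message].freeze

# Render one record as a TSV line, with fields in KEYS order.
def to_tsv_line(record)
  KEYS.map { |key| record.fetch(key).to_s }.join("\t")
end

# Render one record as a single-line JSON object.
def to_json_line(record)
  JSON.generate(record)
end

# Sample record, invented for this example.
record = { "time" => 1_700_000_000, "host" => "web01", "message" => "started" }

# Write each line immediately instead of leaving it in Ruby's output buffer.
$stdout.sync = true
puts to_tsv_line(record)   # what a "format tsv" source would read
puts to_json_line(record)  # what a "format json" source would read
```

A real program would print only one of the two encodings, matching the format parameter.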
tag (required if tag_key is not specified)
The tag of the output events.
tag_key
The key in the event record whose value is used as the event tag. If this parameter is not specified, the tag parameter will be used instead.
time_key
The key in the event record whose value is used as the event time. If this parameter is not specified, the current time will be used instead.
time_format
The format of the event time used for the time_key parameter. The default is UNIX time (integer).
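If the program prints a formatted timestamp rather than UNIX time, the same pattern has to be used on both sides. The sketch below assumes that time_format accepts strptime-style directives and uses a hypothetical pattern; it shows a script formatting a time and the same pattern parsing it back to UNIX time.

```ruby
require "time"

# ASSUMPTION: a strptime-style pattern, also set as time_format in the config.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S%z"

# Format a Time the way the script would print it in the time_key field.
def format_event_time(time)
  time.getutc.strftime(TIME_FORMAT)
end

# Parse the printed value back to UNIX time using the same pattern.
def parse_event_time(text)
  Time.strptime(text, TIME_FORMAT).to_i
end

stamp = format_event_time(Time.at(0))
puts stamp
puts parse_event_time(stamp)
```

Including the UTC offset (%z) in the pattern avoids depending on the local time zone of the machine that parses the value.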
run_interval
The interval between periodic program runs. If no value is specified, the command runs only once.
log_level option
The log_level option allows the user to set different levels of logging for each plugin. The supported log levels are: fatal, error, warn, info, debug, and trace.
Please see the logging article for further details.
Real World Use Case: using in_exec to scrape Hacker News Top Page
If you already have a script that runs periodically (say, via cron) and you wish to store its output in multiple backend systems (HDFS, AWS, Elasticsearch, etc.), in_exec is a great choice.
The only requirement for the script is that it outputs TSV, JSON or MessagePack.
For example, the following script fetches the front page of Hacker News and extracts information about each post:
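The sketch below is a stand-in written for illustration, not the original script. It assumes that each story link on the page appears inside a span with class "titleline"; that markup is a guess and may change, and a regular expression is a fragile way to read HTML. The page URL is passed as a command-line argument, and the function names are hypothetical.

```ruby
require "json"
require "net/http"
require "uri"

# ASSUMPTION: each story link looks like
#   <span class="titleline"><a href="URL">TITLE</a>
# This is a guess at the page markup, not a documented contract.
STORY_PATTERN = %r{<span class="titleline"><a href="([^"]+)"[^>]*>([^<]+)</a>}

# Pull rank, title, and url records out of an HTML string.
def extract_posts(html)
  html.scan(STORY_PATTERN).each_with_index.map do |(url, title), index|
    { "rank" => index + 1, "title" => title, "url" => url }
  end
end

# Fetch the page and print one JSON object per line, for "format json".
def main(page_url)
  html = Net::HTTP.get(URI(page_url))
  $stdout.sync = true
  extract_posts(html).each { |post| puts JSON.generate(post) }
end

# Runs only when a URL is given, e.g.: ruby hn.rb https://news.ycombinator.com/
main(ARGV[0]) if __FILE__ == $PROGRAM_NAME && ARGV[0]
```

Keeping the extraction in its own function lets it be checked against a saved HTML fragment without touching the network.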
Suppose that script is called hn.rb. Then, you can run it every 5 minutes with the following configuration:
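A configuration along these lines would do that. It is a sketch under assumptions: the script path, the URL argument, and the tag name are hypothetical and should be adjusted to how your hn.rb is invoked, and the script is assumed to print one JSON object per line.

```
<source>
  type exec
  format json
  # Hypothetical tag for the scraped posts.
  tag hackernews
  # Hypothetical path and argument; adjust to how your hn.rb is invoked.
  command ruby /path/to/hn.rb https://news.ycombinator.com/
  # Run every 5 minutes.
  run_interval 5m
</source>

<match hackernews>
  # Print each event to Fluentd's stdout.
  type stdout
</match>
```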
If you run Fluentd with this configuration, the scraped records are printed to stdout (if you are impatient, press Ctrl-C to flush the stdout buffer).
Of course, you can use Fluentd's many output plugins to store the data into various backend systems like Elasticsearch, HDFS, MongoDB, AWS, etc.
If this article is incorrect or outdated, or omits critical information, please let us know. Fluentd is an open source project under the Cloud Native Computing Foundation (CNCF). All components are available under the Apache 2 License.