exec

The in_exec input plugin executes external programs to receive or pull event logs. It then reads TSV (tab-separated values), JSON, or MessagePack from the program's stdout. You can run a program periodically or permanently. To run it periodically, use the run_interval parameter.
in_exec is included in Fluentd's core. No additional installation is required.
<source>
@type exec
command cmd arg arg
keys k1,k2,k3
tag_key k1
time_key k2
time_format %Y-%m-%d %H:%M:%S
run_interval 10s
</source>
The @type parameter must be exec.
The command parameter specifies the command (program) to execute.
The format parameter selects the format used to map the program output to the incoming event.
The following formats are supported:
- tsv (default)
- json
- msgpack
When using the tsv format, also specify the keys parameter as a comma-separated list:
keys k1,k2,k3
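As a rough illustration of what a program feeding the tsv format might print, here is a minimal sketch. It assumes the configuration shown earlier (keys k1,k2,k3 with tag_key k1, time_key k2, and time_format %Y-%m-%d %H:%M:%S). The helper name tsv_line and the sample values are hypothetical and not part of Fluentd.

```ruby
# Hypothetical helper (not part of Fluentd): build one TSV line whose columns
# follow the order given by `keys k1,k2,k3`.
#   k1 -> event tag   (tag_key k1)
#   k2 -> event time  (time_key k2, formatted to match time_format)
#   k3 -> an ordinary field in the record
# Values must not contain tabs or newlines, since those delimit columns and rows.
def tsv_line(tag, time, value)
  [tag, time.strftime("%Y-%m-%d %H:%M:%S"), value].join("\t")
end

# Print a single sample row; the tag and value here are made up for illustration.
puts tsv_line("app.metrics", Time.now, 42)
```

Each printed line corresponds to one event, with the columns mapped to the key names in the order they are listed.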
When using the json format, this plugin uses the Yajl library to parse the program output. Yajl buffers data internally so the output isn't always instantaneous.
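On the program side, a script can at least avoid adding its own delay by flushing each line as soon as it is written. The sketch below assumes one JSON object per line is an acceptable output shape. The helper name emit and the sample record are hypothetical. It does not change how the parser buffers data internally.

```ruby
require 'json'

# Hypothetical helper (not part of Fluentd): write one record as a single JSON
# line to the given IO and flush immediately, so the line is not held back in
# the script's own output buffer.
def emit(io, record)
  io.puts(JSON.generate(record))
  io.flush
end

# Sample record; the field names are made up for illustration.
emit($stdout, { "host" => "web01", "status" => 200 })
```

Passing the IO object explicitly keeps the helper easy to exercise against an in-memory buffer as well as stdout.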
The tag parameter sets the tag of the output events.
The tag_key parameter names the key in the event record whose value is used as the event tag. If this parameter is not specified, the tag parameter is used instead.
The time_key parameter names the key in the event record whose value is used as the event time. If this parameter is not specified, the current time is used instead.
The time_format parameter specifies the format of the event time read from the field named by the time_key parameter. The default is UNIX time (integer).
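To see what a format string such as %Y-%m-%d %H:%M:%S matches, it can help to try it in plain Ruby. This sketch assumes the format follows Ruby's strptime directives, which the example configuration suggests. The sample values are taken from the output shown later in this article.

```ruby
require 'time'

# Sample value as it might appear in the field named by time_key.
raw = "2014-05-26 21:51:35"

# Parse it with the same format string used for time_format in the example config.
# Note: Time.strptime interprets the value in the local time zone.
parsed = Time.strptime(raw, "%Y-%m-%d %H:%M:%S")
puts parsed

# With no time_format, the field is expected to hold UNIX time as an integer.
puts Time.at(1401141095).utc
```

If the string does not match the format, Time.strptime raises ArgumentError, which is a quick way to check a format before putting it in the configuration.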
The run_interval parameter sets the interval between periodic program runs. If no value is specified, the command runs only once.
The log_level option allows the user to set a different level of logging for each plugin. The supported log levels are: fatal, error, warn, info, debug, and trace.
If you already have a script that runs periodically (say, via cron) and you wish to store its output in multiple backend systems (HDFS, AWS, Elasticsearch, etc.), in_exec is a great choice. The only requirement for the script is that it outputs TSV, JSON, or MessagePack.
For example, consider a script that scrapes the front page of Hacker News and collects information about each post.
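As a rough sketch of the output side of such a script (the scraping itself is omitted), assume the posts have already been collected and each one is printed as a JSON object on its own line. The field names mirror the sample output later in this article. The helper name post_record and the way the posts are passed in are hypothetical.

```ruby
require 'json'

# Hypothetical helper: turn one scraped post into a record with the same
# fields as the sample output in this article.
def post_record(rank, title, points, user_name, duration, num_comments, now = Time.now)
  {
    "time"         => now.to_i,   # UNIX time of the scrape
    "rank"         => rank,
    "title"        => title,
    "points"       => points,
    "user_name"    => user_name,
    "duration"     => duration,
    "num_comments" => num_comments
  }
end

# Assumed input: posts already extracted from the page. A real script would
# fill this array from the scraped HTML. The row below is copied from the
# sample output in this article.
posts = [
  [1, "Rap Genius Co-Founder Moghadam Fired", 128, "obilgic", "2 hours ago ", 108]
]

# One JSON object per line, matching `format json` in the configuration.
posts.each { |post| puts JSON.generate(post_record(*post)) }
```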
Suppose that script is called hn.rb. Then you can run it every 5 minutes with the following configuration:
<source>
@type exec
format json
tag hackernews
command ruby /path/to/hn.rb
run_interval 5m # don't hit HN too frequently!
</source>
<match hackernews>
@type stdout
</match>
If you run Fluentd with this configuration, you will see output like the following (if you are impatient, press Ctrl-C to flush the stdout buffer):
2014-05-26 21:51:35 +0000 hackernews: {"time":1401141095,"rank":1,"title":"Rap Genius Co-Founder Moghadam Fired","points":128,"user_name":"obilgic","duration":"2 hours ago ","num_comments":108}
2014-05-26 21:51:35 +0000 hackernews: {"time":1401141095,"rank":2,"title":"Whitewood Under Siege: Wooden Shipping Pallets","points":128,"user_name":"drjohnson","duration":"3 hours ago ","num_comments":20}
2014-05-26 21:51:35 +0000 hackernews: {"time":1401141095,"rank":3,"title":"Organic Cat Litter Chief Suspect In Nuclear Waste Accident","points":55,"user_name":"timr","duration":"2 hours ago ","num_comments":12}
2014-05-26 21:51:35 +0000 hackernews: {"time":1401141095,"rank":4,"title":"Do We Really Know What Makes Us Healthy? (2007)","points":27,"user_name":"gwern","duration":"1 hour ago ","num_comments":9}
Of course, you can use Fluentd's many output plugins to store the data into various backend systems like Elasticsearch, HDFS, MongoDB, AWS, etc.
If this article is incorrect or outdated, or omits critical information, please let us know. Fluentd is an open source project under the Cloud Native Computing Foundation (CNCF). All components are available under the Apache 2 License.