The in_exec input plugin executes an external program to receive or pull event logs. It then reads TSV (tab-separated values), JSON, or MessagePack from the program's standard output.
You can run the program periodically or as a long-running process. To run it periodically, set the run_interval parameter.
It is included in Fluentd's core.
Example Configuration
Refer to the Configuration File article for the basic structure and syntax of the configuration file.
Here is a simple example to fetch load average stats on Linux systems. This configuration instructs Fluentd to read /proc/loadavg once per minute and emit the file content as events.
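A minimal sketch of such a configuration follows. The tag system.loadavg and the key names avg1, avg5, and avg15 are illustrative assumptions, as is the use of cut to keep only the three load averages before the tsv parser splits them on spaces.

```
<source>
  @type exec
  # Illustrative tag name (an assumption, not mandated by the plugin)
  tag system.loadavg
  # Keep only the first three space-separated fields of /proc/loadavg
  command cat /proc/loadavg | cut -d ' ' -f 1,2,3
  # Run the command once per minute
  run_interval 1m
  <parse>
    @type tsv
    # Illustrative key names for the 1, 5, and 15 minute load averages
    keys avg1,avg5,avg15
    delimiter " "
  </parse>
</source>
```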
This configuration emits events like this one:
Real World Example: Scrape Hacker News Top Page
If you already have a script that runs periodically (say, via cron) and you wish to store its output in multiple backend systems (HDFS, AWS, Elasticsearch, etc.), in_exec is a great choice.
The only requirement for the script is that it outputs TSV, JSON or MessagePack.
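As a rough illustration of that requirement, the sketch below prints hypothetical placeholder records to standard output, one JSON object per line. It does not scrape anything. The helper names (sample_records, to_json_lines, to_tsv_lines) and the record fields are assumptions made for this example only, and the sketch assumes the plugin reads one record per line.

```ruby
require "json"

# Hypothetical placeholder records; a real script would collect
# these from its own data source.
def sample_records(now = Time.now)
  [
    { "time" => now.to_i, "rank" => 1, "title" => "first item" },
    { "time" => now.to_i, "rank" => 2, "title" => "second item" },
  ]
end

# Serialize each record as a single-line JSON object.
def to_json_lines(records)
  records.map { |record| JSON.generate(record) }
end

# Serialize each record as one tab-separated line, with the values
# in the order given by `keys`.
def to_tsv_lines(records, keys)
  records.map { |record| keys.map { |key| record[key].to_s }.join("\t") }
end

if __FILE__ == $PROGRAM_NAME
  # Write lines immediately instead of buffering them.
  $stdout.sync = true
  puts to_json_lines(sample_records)
end
```

With JSON output, the parse section of the source would use the json parser. With TSV output, the tsv parser would need the same key order that the script uses.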
For example, consider a script that scrapes the front page of Hacker News and extracts information about each post.
Suppose that script is called hn.rb. Then, you can run it every 5 minutes with the following configuration:
<source>
  @type exec
  <parse>
    @type json
  </parse>
  tag hackernews
  command ruby /path/to/hn.rb
  run_interval 5m # don't hit HN too frequently!
</source>
<match hackernews>
  @type stdout
</match>
If you run Fluentd with this configuration, you will see output like the following (if you are impatient, press Ctrl+C to flush the stdout buffer):
2017-12-08 14:19:33.160567411 +0900 hackernews: {"time":1512710373,"rank":1,"title":"Japan eyes startup visa program","points":160,"user_name":"benguild","duration":"4 hours ago ","num_comments":0,"unique_id":"item?id=15875627","hiring_notice":false}
2017-12-08 14:19:33.160735378 +0900 hackernews: {"time":1512710373,"rank":2,"title":"Bookbinding: A Tutorial","points":46,"user_name":"jstrieb","duration":"2 hours ago ","num_comments":0,"unique_id":"item?id=15876260","hiring_notice":false}
2017-12-08 14:19:33.160769125 +0900 hackernews: {"time":1512710373,"rank":3,"title":"My Quadriplegic Husband and Me","points":92,"user_name":"mooreds","duration":"4 hours ago ","num_comments":0,"unique_id":"item?id=15875772","hiring_notice":false}
2017-12-08 14:19:33.160799115 +0900 hackernews: {"time":1512710373,"rank":4,"title":"Wall Street banks hit pause button on Bitcoin","points":16,"user_name":"tadasv","duration":"1 hour ago ","num_comments":0,"unique_id":"item?id=15876497","hiring_notice":false}
2017-12-08 14:19:33.160824386 +0900 hackernews: {"time":1512710373,"rank":5,"title":"A Spectator Who Threw a Wrench in the Waymo/Uber Lawsuit","points":107,"user_name":"kynthelig","duration":"4 hours ago ","num_comments":0,"unique_id":"item?id=15875685","hiring_notice":false}
Of course, you can use Fluentd's many output plugins to store the data into various backend systems like Elasticsearch, HDFS, MongoDB, AWS, etc.