Taming filebeat on Elasticsearch (part 1)

This is a multi-part series on using filebeat to ingest data into Elasticsearch. Prior to Elasticsearch 5.x, Logstash was one of the main tools for ingesting data; in 5.x Elastic introduced another option called “beats”. This series focuses on integrating filebeat with the ingest node.

the pick…

A common question is which tool to pick: Logstash or filebeat? The following table summarises their capabilities:

                       Logstash   filebeat
data shipper*          ✓          ✓
data enricher**        ✓
native executable***              ✓
* data shipper means ingesting data from various sources (e.g. server logs, application logs, IoT devices etc.)
** data enricher means enriching the source data (e.g. an IP address could be enriched with geographical information such as the country and city it originates from)
*** native executable means it does not need a runtime or interpretation engine to run

So if you are NOT going to enrich the source data, filebeat is your natural pick. If you NEED some enrichment or pre-processing of the source data, Logstash could be your pick.

Even though filebeat itself cannot enrich the source data, you can achieve the same result by configuring ingest-node pipelines; this series will go through a use case of ingesting stock quotes into Elasticsearch using only filebeat and the ingest node.
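To give a quick taste of the idea, here is a minimal sketch of an ingest-node pipeline that splits a csv line into named fields with a grok processor. Note that the pipeline id “stocks” and the field names (date, open, high, low, close, volume) are made up here purely for illustration; the actual pipeline for our data comes later in the series.

curl -XPUT 'http://localhost:9200/_ingest/pipeline/stocks' -d '
{
  "description": "split a csv stock quote into fields (illustrative sketch)",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{DATA:date},%{NUMBER:open},%{NUMBER:high},%{NUMBER:low},%{NUMBER:close},%{NUMBER:volume}"]
      }
    }
  ]
}'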

installation

Elasticsearch 5.x

  1. download from elastic.co
  2. copy and unzip the archive to your favourite location
  3. start es5 (see the sketch after this list)
  4. open a browser and verify that es5 is running: http://localhost:9200
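For example, on linux / macOS it could look like this (the version 5.1.1 below is just an assumption – use whatever version you downloaded):

cd elasticsearch-5.1.1
./bin/elasticsearch

# from another terminal, verify es5 is up
curl http://localhost:9200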

filebeat 5.x

  1. download from elastic.co
  2. pick the archive based on the OS you are going to run filebeat on (e.g. a windows 64-bit environment)
  3. copy and unzip / install the archive (see the sketch after this list)
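For example, on macOS it could look like this (the exact download link and version are assumptions – grab the real link from the elastic.co download page):

curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-5.1.1-darwin-x86_64.tar.gz
tar xzvf filebeat-5.1.1-darwin-x86_64.tar.gz
cd filebeat-5.1.1-darwin-x86_64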

the stock quotes data files

I have prepared the data files (csv) on github: https://github.com/logmonster/blog/tree/master/dataset/stocks. The following data files are required for our first experiment:

  • alphabet-2016.csv – Google
  • amazon-2016.csv – Amazon
  • ibm-2016.csv – IBM
  • netflix-2016.csv – Netflix
  • tesla-2016.csv – Tesla
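The easiest way to grab them all is to clone the repo and look under the dataset/stocks folder:

git clone https://github.com/logmonster/blog.git
ls blog/dataset/stocks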

the first experiment

goal: ingest a data file and output it to the command console
file: alphabet-2016.csv
steps:
  1. cd {filebeat-HOME}
  2. make a copy of filebeat.yml in case you need to revert your changes later on (this is the config file for running filebeat)
  3. now edit the filebeat.yml file using your favourite editor
    navigate to the line “paths:” under the “filebeat.prospectors:” section
    set the value to the absolute path of the data file (see the config sketch after these steps)
  4. to execute filebeat =>
    ./filebeat -e -c {filebeat-config-file.yml} -d "publish"
  5. you can see the last line “DBG Events sent: 256”, which means that 256 rows of data have been processed
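For reference, the relevant part of filebeat.yml for this experiment could look like the minimal sketch below (the path is an assumption – point it at wherever you saved the csv file; the rest of the default config can stay untouched):

filebeat.prospectors:
- input_type: log
  paths:
    # absolute path to the data file downloaded earlier (adjust to your location)
    - /absolute/path/to/dataset/stocks/alphabet-2016.csv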

If you look at the command running filebeat, there are a couple of switches / parameters such as “-e”, “-c” and “-d”. The “-e” switch enables logging to stderr; the “-c” switch states which config file to use (by default “filebeat.yml” is picked if nothing is supplied); the “-d” switch selects which debug selectors to log, and in our experiment we want to see the debug logs related to “publish”. If you want to know all the available switches, execute this command =>
./filebeat --help
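The “-d” switch also accepts a comma-separated list of debug selectors, and “*” turns all of them on; for example (an illustrative combination, not from the steps above) =>

./filebeat -e -c filebeat.yml -d "*"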

cool~ we’ve made it! Our first experiment on ingesting a data file (csv) using filebeat! Next, we will see how to pass the ingested data / json into Elasticsearch, stay tuned~

End of part 1
part 2 – a taste of data ingestion
part 3 – final ingestion and visualization of your data
