Getting Started with Elasticsearch

We're currently focusing on improving our ELK (Elasticsearch, Logstash and Kibana) monitoring service and the reporting around error levels. This has given me the perfect opportunity to learn a bit more about Elasticsearch and play around with its REST APIs.

Elasticsearch is a search and analytics engine built on top of Lucene that exposes a RESTful interface for CRUD operations against JSON documents. At JUST EAT we use it for a number of resilient search solutions, including the storage of all logs generated by our various applications.

In the ELK stack, Logstash handles configurable ingestion of logs from multiple sources, Elasticsearch acts as the data store, and Kibana sits on top, exposing a powerful UI for querying and aggregation.

Downloading & Running the Elasticsearch Service

Firstly, download Elasticsearch here and unzip it. There's no installation needed, which is nice, but you do need Java.

Then

  1. Make sure you've got the latest JDK installed
  2. Open Powershell in the directory Elasticsearch was unzipped to
  3. Run .\bin\elasticsearch.bat
  4. Go to http://127.0.0.1:9200 in your browser to check it is online

If it's all running fine you should see something like this:

{
  "name" : "PcD6pEJ",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "mMvo75HtSIqg4iOB9Auxjw",
  "version" : {
    "number" : "5.0.0",
    "build_hash" : "253032b",
    "build_date" : "2016-10-26T04:37:51.531Z",
    "build_snapshot" : false,
    "lucene_version" : "6.2.0"
  },
  "tagline" : "You Know, for Search"
}

Inserting Data

New data is inserted like so; this will create a new index called "logs" with a type of "sample" and auto-generate an ID for the entry (stored in the _id field for you).

Invoke-WebRequest -Method POST `
                  -Uri 'http://127.0.0.1:9200/logs/sample' `
                  -ContentType 'application/json' `
                  -Body '{"Message": "Something happened", "username": "bob"}'
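The response to the POST tells you what was created, including the generated identifier. A rough, abridged sketch of the body that comes back (the _id here is a made-up placeholder; the real value is a random string, and the full response also includes shard details):

```json
{
  "_index": "logs",
  "_type": "sample",
  "_id": "<generated-id>",
  "_version": 1,
  "result": "created"
}
```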

Data can also be PUT into a collection; in this case the identifier of the entry is "4". If re-run, this will overwrite the document stored at that identifier.

Invoke-WebRequest -Method PUT `
                  -Uri 'http://127.0.0.1:9200/logs/sample/4' `
                  -ContentType 'application/json' `
                  -Body '{"Message": "Something happened", "username": "bob"}'

Getting the Data Out

This will get a single entry out of the database by its identifier:

Invoke-WebRequest -Method GET -Uri 'http://127.0.0.1:9200/logs/sample/4'

The _search endpoint can be used to search for lists of values. This example does a full-text search for "Something". Elasticsearch is built on Lucene, so text is tokenised into individual terms at index time.

Invoke-WebRequest -Method GET -Uri 'http://127.0.0.1:9200/_search?q=Something'

This example does a wildcard search, matching terms that contain "happen":

Invoke-WebRequest -Method GET -Uri 'http://127.0.0.1:9200/_search?q=*Happen*'

This example searches for records where the field "username" has the value "bob":

Invoke-WebRequest -Method GET -Uri 'http://127.0.0.1:9200/_search?q=username:bob'
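The q= query-string style is handy for quick checks, but the same field search can also be expressed as a JSON request body using Elasticsearch's query DSL, which is the usual approach for anything non-trivial. A sketch of the body you'd send to the _search endpoint for the "username is bob" search above:

```json
{
  "query": {
    "match": {
      "username": "bob"
    }
  }
}
```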

Searches can be performed across all indexes in Elasticsearch (as above) or against a specific index like so:

Invoke-WebRequest -Method GET -Uri 'http://127.0.0.1:9200/logs/_search?q=*Happen*'

Deleting Data

Data can be deleted by identifier like this:

Invoke-WebRequest -Method DELETE -Uri 'http://127.0.0.1:9200/logs/sample/4'

An entire index can also be deleted like this:

Invoke-WebRequest -Method DELETE -Uri 'http://127.0.0.1:9200/logs'
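Between those two extremes, documents matching a query can be removed with the _delete_by_query endpoint (available in Elasticsearch 5.0+). A sketch of the request body, POSTed to http://127.0.0.1:9200/logs/_delete_by_query, which would delete every entry where "username" is "bob":

```json
{
  "query": {
    "match": {
      "username": "bob"
    }
  }
}
```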

Next Steps

I was pleasantly surprised to learn that Elasticsearch supports some pretty serious aggregation as part of its query language. This is how Kibana is able to serve complex graphs over vast quantities of data within a reasonable response time. There's a blog post here that goes into this in detail.
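As a small taste, the request body below, sent to the _search endpoint, would count log entries per user with a terms aggregation. This is a sketch: it assumes the ".keyword" sub-field that Elasticsearch 5.x's dynamic mapping creates for string fields, and "size": 0 just suppresses the matching documents so only the aggregation buckets come back.

```json
{
  "size": 0,
  "aggs": {
    "entries_per_user": {
      "terms": {
        "field": "username.keyword"
      }
    }
  }
}
```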