Elasticsearch is a search-oriented database tool for querying written text. It is accessed over HTTP, so it can even be queried from a browser. Full-text search is its core task, although it performs other tasks as well. It is designed for wading through text and returning the documents that match a given query. Elasticsearch is scalable and supports clustering.
Modifying the data and logs directories: these changes are made in the elasticsearch.yml file. They are optional, and you can keep the default locations. I prefer having all data stored in one location and the logs within the install folder for easy debugging. On Mac OS, you can use Homebrew's brew install command to install Elasticsearch, as well as Kibana and the rest of the ELK Stack. An official Homebrew tap developed by Elastic makes this procedure very easy.
More specifically, Elasticsearch is a database server written in Java. It takes data from users and stores it as documents. Elasticsearch lets us perform various types of searches on the stored data, such as searching a set of records by description, searching through the posts on a blog, and finding a piece of text in a body of crawled web pages.
Below are some key points about Elasticsearch -
- Elasticsearch is a NoSQL database based on the Apache Lucene search engine.
- It is convenient to work with because its main protocol is JSON over HTTP.
- Elasticsearch is well suited to storing data and performing operations on it.
- It is an excellent tool for cutting through semi-structured, natural-language data.
Elasticsearch is Lucene
Elasticsearch is built on another software project called Lucene, which does the actual indexing and searching and therefore has a central place in Elasticsearch. The points below explain the relationship between the two -
- Elasticsearch is easiest to understand as a piece of infrastructure built around the Lucene Java libraries.
- Lucene is a popular, proven, and well-tested search engine library that is widely used.
- Elasticsearch offers its users a more concise and usable API, and it adds scalability and operational tooling on top of Lucene's implementation.
- A large number of companies use Apache Lucene.
- Lucene is considered best-of-breed in open-source search software.
The Value Add
- Lucene is an excellent tool, but it is cumbersome to use directly.
- Elasticsearch offers a simpler and more intuitive API than the Lucene Java API.
- It provides infrastructure that makes scaling across machines and data centres relatively simple.
- Below are some features that Elasticsearch adds compared to Lucene:
- Simple API
- Clustering and replication
- Interoperation with non-Java languages
- Operational ease of use, etc.
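To give a feel for the simple API and the replication feature listed above, here is a minimal Python sketch that builds (but does not send) the HTTP request for creating an index with a chosen number of shards and replicas. The index name products, the shard and replica counts, and the server address http://localhost:9200 are assumptions made for illustration only.

```python
import json
import urllib.request


def build_create_index_request(base_url, index, shards, replicas):
    """Build a PUT request that would create an index with the given settings.

    The request is only constructed here; nothing is sent to a server.
    """
    body = {
        "settings": {
            "number_of_shards": shards,      # how many pieces the index is split into
            "number_of_replicas": replicas,  # extra copies kept of each piece
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical index name and local server address, for illustration only.
request = build_create_index_request(
    "http://localhost:9200", "products", shards=3, replicas=1
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

With a server running, passing the request to urllib.request.urlopen would send it. The whole operation is one HTTP call with a small JSON body, which is the convenience that Elasticsearch adds on top of Lucene.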
Problems solved by Elasticsearch
Every tool comes with features that solve problems other tools do not, and Elasticsearch is no exception. It is useful in many cases and solves problems that are awkward for other databases and search engines. The following are a few tasks for which Elasticsearch is well suited.
- Searching a large number of products for a particular phrase or text string (such as chef's knife) and returning the best results for it.
- Continuing the chef's knife example, breaking the results down by department and showing the sections in which the product appears.
- Auto-completing a partially typed word in a search box, based on previously issued searches.
- Storing a large amount of JSON (semi-structured) data in a distributed manner, with a specified level of redundancy for the data across a cluster of machines.
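The chef's knife example from the list above could be expressed as a search request body roughly like the one built below. This is only a sketch: the field names name and department are hypothetical, and department is assumed to be indexed as a keyword field so that results can be grouped by it.

```python
import json


def build_product_search(phrase, size=10):
    """Build a search body: full-text match plus a per-department breakdown."""
    return {
        "size": size,
        # Full-text match on a hypothetical "name" field; hits are ranked by score.
        "query": {"match": {"name": phrase}},
        # Count the matching products per department (hypothetical keyword field).
        "aggs": {"by_department": {"terms": {"field": "department"}}},
    }


body = build_product_search("chef's knife")
print(json.dumps(body, indent=2))
```

The body would be sent with a POST to the _search endpoint of the index that holds the products. The response would contain both the best-matching products and the count of matches in each department.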
Elasticsearch is well suited to the types of problems mentioned above. However, it is not the best choice for problems that a relational database (RDBMS) is optimized for. Some of these are listed below:
- Calculating the number of items left in inventory.
- Executing two operations transactionally, since it does not provide rollback support.
- Finding the sum of all line items on all invoices sent out in a given month.
- Creating multiple records that are guaranteed to be unique across several given fields, for example a mobile number.
Elasticsearch is very good at providing approximate answers from data, for example by scoring results by quality. It can also perform matching and statistical calculations. Traditional relational databases (RDBMS) provide precision and data integrity, for which Lucene and Elasticsearch have only limited provisions.
Up and Running
Elasticsearch runs on the Java platform, so it can be installed and run on almost every operating system, such as Windows, Linux, or Mac. In an earlier tutorial, we already discussed the installation, uses, and the tools that need to be set up for Elasticsearch on the Windows platform.
Now, let's briefly discuss how to set up the tools we may need to work with Elasticsearch on Windows, Linux, and Mac. The basic steps below are common to all platforms -
- First, install Java and set its path.
- Verify that Java is installed on your system by executing the java -version command.
- After that, install Elasticsearch. It runs from the command line by executing its startup script (a shell script on Linux and Mac, elasticsearch.bat on Windows).
- The startup script does not provide an interactive user interface, so you need to install a plugin or a separate tool for that.
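The Java check in the steps above can also be scripted. The sketch below runs java -version and extracts the quoted version string. The helper names are made up for this example, and the parsing assumes the usual banner format in which the version appears in double quotes.

```python
import re
import subprocess


def parse_java_version(output):
    """Return the quoted version string from `java -version` output, or None."""
    match = re.search(r'version "([^"]+)"', output)
    return match.group(1) if match else None


def installed_java_version():
    """Run `java -version` and return the version string, or None if unavailable."""
    try:
        # `java -version` prints its banner to stderr rather than stdout.
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True
        )
    except OSError:
        # Java is not installed or not on the PATH.
        return None
    return parse_java_version(result.stderr + result.stdout)


version = installed_java_version()
print(version if version else "Java was not found on the PATH")
```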
These are the basic steps to install Elasticsearch. Some steps differ by platform, as described below -
Linux
- If you are using Linux, we recommend downloading the .rpm or .deb package of Elasticsearch, depending on your distribution.
- Once it has downloaded, install it with your distribution's package manager. A package does not need to be unzipped.
- After installing, start the Elasticsearch service (for example, through your system's service manager) so that the server runs as a daemon in the background. The elasticsearch.bat batch file is used only on Windows.
Mac OS
- For the Mac operating system, we recommend downloading the archive package (.zip or .tar.gz) of Elasticsearch.
- Once it is downloaded, extract it into a folder.
- Mac OS has a Terminal application for running commands, similar to the Windows command prompt.
- In the terminal, navigate to the bin directory and start Elasticsearch by running ./elasticsearch.
Microsoft Windows
We have already discussed the Microsoft Windows operating system in detail in an earlier chapter of this tutorial. Here is a brief overview.
- Download the .zip package of Elasticsearch and extract the files from it.
- Run the elasticsearch.bat batch file inside the bin folder.
- After running it, check whether the server is running, as described next.
Verify that the server is running
After installing Elasticsearch on your preferred operating system and starting it, verify that the server is running without errors. For this, we need a web browser.
- Open any web browser, such as Chrome, IE, or Mozilla Firefox.
- Type localhost:9200 in the URL bar of your browser.
- If you get a JSON response like the one in the screenshot below, the server is running successfully.
[Screenshot: Elasticsearch response at localhost:9200]
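The same check can be done from a script instead of a browser. The sketch below requests the root URL and summarizes the JSON that Elasticsearch returns there. It assumes an unsecured local server on the default port; the function names and the sample values in the comments are illustrative.

```python
import json
import urllib.error
import urllib.request


def summarize_root_response(body):
    """Pick the node name, cluster name and version out of the root response."""
    info = json.loads(body)
    return {
        "name": info.get("name"),
        "cluster_name": info.get("cluster_name"),
        "version": info.get("version", {}).get("number"),
    }


def check_server(url="http://localhost:9200", timeout=5):
    """Return a summary dict if Elasticsearch answers at `url`, else None."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return summarize_root_response(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return None


summary = check_server()
print(summary if summary else "Elasticsearch is not reachable on localhost:9200")
```

A None result only means that nothing usable answered at that address. A server with security enabled, for example, would reject the unauthenticated request even though it is running.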
Loading Datasets
- Elasticsearch offers some pre-defined code samples that depend on an underlying dataset being loaded into your Elasticsearch server.
- The repository that holds them changes frequently, so keep your copy up to date.
- After cloning the repository, you can load the examples into the server by running the elastic-loader.jar program with the address of your Elasticsearch server and the path to the data file.
- Whenever you need to load a dataset such as movie_db, open a command prompt, navigate to the ee-datasets folder, and run the loader from there.
Running Examples
This chapter uses only the most popular Elasticsearch API, JSON over HTTP. All the examples discussed and implemented in this tutorial are plain descriptions of HTTP requests, independent of any particular client.
- You are free to use any tool to interact with Elasticsearch. We recommend the elasticsearch-head plugin, as we run all the queries in this tutorial with it.
- The elasticsearch-head plugin is available as a browser extension in the Chrome Web Store. You can add it to your browser. It is simple and easy to use.
- There are other tools and plugins, such as analysis-icu, Kibana, and many more, that also work well with Elasticsearch.
- We have learned about HTTP operations (GET, POST, DELETE, PUT, HEAD), the request body, and the URL path.
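To make the roles of the HTTP operation, the URL path and the request body concrete, here is a small Python sketch that builds one request per operation without sending any of them. The movies index, the document fields and the local server address are hypothetical, and the paths assume a recent Elasticsearch version in which documents are addressed as /index/_doc/id.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumed local server address


def es_request(method, path, body=None):
    """Build an HTTP request for Elasticsearch; a dict body is sent as JSON."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


# One request per HTTP operation, using a hypothetical "movies" index.
requests_to_try = [
    es_request("PUT", "/movies/_doc/1", {"title": "Example Movie", "year": 2000}),  # store
    es_request("GET", "/movies/_doc/1"),     # fetch the document
    es_request("HEAD", "/movies/_doc/1"),    # check that it exists
    es_request("POST", "/movies/_search", {"query": {"match": {"title": "example"}}}),  # search
    es_request("DELETE", "/movies/_doc/1"),  # remove the document
]

for req in requests_to_try:
    print(req.get_method(), req.full_url)
```

Each request could be sent with urllib.request.urlopen once a server is running. Tools such as elasticsearch-head issue the same kind of requests behind their user interface.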
This chapter is all about exploring Elasticsearch.
The installation matrix for the ELK Stack (Elasticsearch, Logstash and Kibana) is extremely varied, with Linux, Windows and Docker all being supported. For development purposes, installing the stack on Mac OS X is a more frequent scenario.
Without further ado, let’s get down to business.
Installing Homebrew
To install the stack on Mac you can download a .zip or tar.gz package. This tutorial, however, uses Homebrew to handle the installation.
Make sure you have it installed. If not, you can use the following command in your terminal:
If you already have Homebrew installed, please make sure it’s updated:
Installing Java
The ELK Stack requires Java 8 to be installed.
To verify what version of Java you have, run java -version in your terminal.
To install Java 8 go here.
Installing Elasticsearch
Now that we’ve made sure our system and environment have the required pieces in place, we can begin with installing the stack’s components, starting with Elasticsearch:
Start Elasticsearch with Homebrew:
Use your favorite browser to check that it is running correctly on localhost and the default port: http://localhost:9200
The output should look something like this:
Installing Logstash
Your next step is to install Logstash:
You can run Logstash using the following command:
Since we haven’t configured a Logstash pipeline yet, starting Logstash will not result in anything meaningful. We will return to configuring Logstash in another step below.
Installing Kibana
Finally, let’s install the last component of ELK – Kibana.
Start Kibana and check that all of ELK services are running.
Kibana will need some configuration changes to work.
Open the Kibana configuration file: kibana.yml
Uncomment the directives for defining the Kibana port and Elasticsearch instance:
If everything went well, open Kibana at http://localhost:5601/status. You should see something like this:
Congratulations, you’ve successfully installed ELK on your Mac!
Since this is a vanilla installation, you have no Elasticsearch indices to analyze in Kibana. We will take care of that in the next step.
Shipping some data
You are ready to start sending data into Elasticsearch and enjoy all the goodness that the stack offers. To help you get started, here is an example of a Logstash pipeline sending syslog logs into the stack.
First, you will need to create a new Logstash configuration file:
Enter the following configuration:
Then, restart the Logstash service:
In the Management tab in Kibana, you should see a new “syslog-demo” index created by the Logstash pipeline.
Enter it as an index pattern, and in the next step select the @timestamp field as your Time Filter field name.
And…you’re all set! Open the Discover page and you’ll see syslog data in Kibana.
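If the “syslog-demo” index does not show up in Kibana, one way to check whether it was created at all is to ask Elasticsearch for its list of indices (a GET request to /_cat/indices?format=json). The sketch below parses such a response. The sample JSON is illustrative data written for this example, not real output.

```python
import json


def index_names(cat_indices_json):
    """Return the index names from a `_cat/indices?format=json` response body."""
    return [entry["index"] for entry in json.loads(cat_indices_json)]


# Illustrative response body; a real one carries more fields per index.
sample_response = (
    '[{"index": ".kibana", "health": "green"},'
    ' {"index": "syslog-demo", "health": "yellow"}]'
)

names = index_names(sample_response)
print("syslog-demo" in names)
```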