An instance of an Elasticsearch is a node and a collection of nodes is called a cluster.
All nodes know about all the other nodes in the cluster and can forward client requests to the appropriate node. Besides that, each node serves one or more purpose:
- Master node
- Data node
- Client node
- Tribe node
1. Master node
Master node controls the cluster. Any master-eligible node (all nodes by default) may be elected to become the master node by the master election process. The master node is responsible for lightweight cluster-wide actions such as creating or deleting an index, tracking which nodes are part of the cluster, and deciding which shards to allocate to which nodes. It is important for cluster health to have a stable master node.
Indexing and searching your data is CPU-, memory-, and I/O-intensive work which can put pressure on a node’s resources. To ensure that your master node is stable and not under pressure, it is a good idea in a bigger cluster to split the roles between dedicated master-eligible nodes and dedicated data nodes.
While master nodes can also behave as coordinating nodes and route search and indexing requests from clients to data nodes, it is better not to use dedicated master nodes for this purpose. It is important for the stability of the cluster that master-eligible nodes do as little work as possible.
To create a standalone master-eligible node, set:
2. Data node
Data nodes hold the shards that contain the documents you have indexed. It also performs data related operations such as CRUD, search, and aggregations. These operations are I/O-, memory-, and CPU-intensive. It is important to monitor these resources and to add more data nodes if they are overloaded.
To create a standalone data node, set:
3. Client node
If you take away the ability to be able to handle master duties and take away the ability to hold data, then you are left with a client node that can only route requests, handle the search reduce phase, and distribute bulk indexing. Client node neither hold data nor become the master node. It behaves as a “smart router” and is used to forward cluster-level requests to master node and data-related requests(such as search) to the appropriate data nodes. Requests like search requests or bulk-indexing requests may involve data held on different data nodes.
A search request, for example, is executed in two phases which are coordinated by the node which receives the client request – the coordinating node:
- In the scatter phase, the coordinating node forwards the request to the data nodes which hold the data. Each data node executes the request locally and returns its results to the coordinating node.
- In the gather phase, the coordinating node reduces each data node’s results into a single global resultset.
This means that a client node needs to have enough memory and CPU in order to deal with the gather phase.
Standalone client nodes can benefit large clusters by offloading the coordinating node role from data and master-eligible nodes. Client nodes join the cluster and receive the full cluster state, like every other node, and they use the cluster state to route requests directly to the appropriate place(s).
Adding too many client nodes to a cluster can increase the burden on the entire cluster because the elected master node must await acknowledgement of cluster state updates from every node! The benefit of client nodes should not be overstated — data nodes can happily serve the same purpose as client nodes.
To create a client node, set:
4. Tribe node
A tribe node, configured via the tribe.* settings, is a special type of client node that can connect to multiple clusters and perform search and other operations across all connected clusters.
Avoiding split brain with minimum_master_nodes:
minimum_master_nodes setting can be used to avoid split brain. Split brain occurs where the cluster is divided into two smaller cluster due to network partition or other issue and both individual smaller cluster now selects its own individual masters hence having more than one master at a time.
To elect a master a quorum of master-eligible nodes is required: (master_eligible_nodes / 2) + 1
So, in a cluster with 3 master-eligible nodes, the minimum quorum required is (3 / 2) + 1 = 2.
Now, if due to network partition, there are two smaller clusters with 2 master-eligible nodes available in 1st cluster and 3rd master-eligible node in 2nd cluster, since 2nd cluster does not have the required quorom of 2 it cannot select a new master.
Hence, if there are three master-eligible nodes, then minimum mater nodes should be set to 2.
Suggested deployment for a small/medium size cluster (12-20 nodes) deployed in 4 different clouds:
1. 3 master node medium computes one in 3 different cloud (This is to ensure high availability if entire cloud goes down).
2. 2 client node large/extra large computes one in each cloud, again for availability reason.
3. Remaining large/extra large computes distributed equally among all clouds.