Elasticsearch cluster setup with NFS data directory

Elasticsearch cluster setup with NFS data directory

Here you will lean how to setup your ElasticSearch cluster to use NFS as it's data directory.

22 April 2022

elasticsearchnfs

Installing an Elasticsearch Cluster

As always, there are multiple ways of setting up an Elasticsearch cluster. In this case, we will be manually setting up a cluster consisting of one master node and two data nodes, all on Ubuntu 24.04 instances on AWS EC2 running in the same VPC. The security group was configured to enable access from anywhere using SSH and TCP 5601 (Kibana).


Installing Java

Elasticsearch is built on Java and requires at least Java 8 (1.8.0_131 or later) to run. Our first step, therefore, is to install Java 8 on all the nodes in the cluster. Please note that the same version should be installed on all Elasticsearch nodes in the cluster.

Repeat the following steps on all the servers designated for your cluster.

First, update your system's package list:

sudo apt update

Then, install Java with:

sudo apt install -y default-jre

Installing Elasticsearch nodes

Our next step is to install Elasticsearch. As before, repeat the steps in this section on all your servers.

First, you need to add Elastic’s signing key so that the downloaded package can be verified (skip this step if you’ve already installed packages from Elastic):

sudo install -dm 755 /etc/apt/keyrings
curl -fsSL https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /etc/apt/keyrings/elastic.gpg

For Debian, we need to then install the apt-transport-https package:

sudo apt install -y apt-transport-https

The next step is to add the repository definition to your system:

echo "deb [signed-by=/etc/apt/keyrings/elastic.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list

All that’s left to do is to update your repositories and install Elasticsearch:

sudo apt update
sudo apt install -y elasticsearch

Configuring the Elasticsearch cluster

Our next step is to set up the cluster so that the nodes can connect and communicate with each other. For each node, open the Elasticsearch configuration file:

sudo vim /etc/elasticsearch/elasticsearch.yml

This file is quite long, and contains multiple settings for different sections. Browse through the file, and enter the following configurations (replace the IPs with your node IPs):

#give your cluster a name.
cluster.name: my-cluster

#give your nodes a name (change node number from node to node).
node.name: "es-node-1"

#define node 1 as master-eligible:
node.master: true

#define nodes 2 and 3 as data nodes:
node.data: true

#enter the private IP and port of your node:
network.host: 172.11.61.27
http.port: 9200

#detail the private IPs of your nodes:
discovery.zen.ping.unicast.hosts: ["172.11.61.27", "172.31.22.131", "172.31.32.221"]

Save and exit.

Running your Elasticsearch cluster

You are now ready to start your Elasticsearch nodes and verify they are communicating with each other as a cluster.

For each instance, run the following command:

sudo service elasticsearch start

To enable elasticsearch on system startup

sudo systemctl enable elasticsearch

If everything was configured correctly, your Elasticsearch cluster should be up and running. To verify everything is working as expected, query Elasticsearch from any of the cluster nodes:

curl -XGET 'http://localhost:9200/_cluster/state?pretty'

The response should detail the cluster and its nodes:

{
  "cluster_name" : "my-cluster",
  "compressed_size_in_bytes" : 351,
  "version" : 4,
  "state_uuid" : "3LSnpinFQbCDHnsFv-Z8nw",
  "master_node" : "IwEK2o1-Ss6mtx50MripkA",
  "blocks" : { },
  "nodes" : {
    "IwEK2o1-Ss6mtx50MripkA" : {
      "name" : "es-node-2",
      "ephemeral_id" : "x9kUrr0yRh--3G0ckESsEA",
      "transport_address" : "172.31.50.123:9300",
      "attributes" : { }
    },
    "txM57a42Q0Ggayo4g7-pSg" : {
      "name" : "es-node-1",
      "ephemeral_id" : "Q370o4FLQ4yKPX4_rOIlYQ",
      "transport_address" : "172.31.62.172:9300",
      "attributes" : { }
    },
    "6YNZvQW6QYO-DX31uIvaBg" : {
      "name" : "es-node-3",
      "ephemeral_id" : "mH034-P0Sku6Vr1DXBOQ5A",
      "transport_address" : "172.31.52.220:9300",
      "attributes" : { }
    }
  },

Using NFS as data directory

Install packages for NFS client mounting

sudo apt install -y nfs-common

Creating Mount Points and Mounting Directories on the Client

Now that the host server is configured and serving its shares, we’ll prepare our client.

In order to make the remote shares available on the client, we need to mount the directories on the host that we want to share to empty directories on the client.

Note: If there are files and directories in your mount point, they will become hidden as soon as you mount the NFS share. To avoid the loss of important files, be sure that if you mount in a directory that already exists that the directory is empty.

We’ll create two directories for our mounts:

sudo mkdir -p /nfs/elasticsearch

Now that we have a location to put the remote shares and we’ve opened the firewall, we can mount the shares by addressing our host server, which in this docs is 203.0.113.0:

sudo mount 203.0.113.0:/var/nfs/elasticsearch /nfs/elasticsearch

These commands will mount the shares from the host computer onto the client machine. You can double-check that they mounted successfully in several ways. You can check this with a plain mount or findmnt command, but df -h provides a more easily readable output that illustrates how disk usage is displayed differently for the NFS shares:

df -h

Output
Filesystem Size Used Avail Use% Mounted on
udev 238M 0 238M 0% /dev
tmpfs 49M 628K 49M 2% /runv /dev/vda1 20G 1.2G 18G 7% /
tmpfs 245M 0 245M 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 245M 0 245M 0% /sys/fs/cgroup
tmpfs 49M 0 49M 0% /run/user/0v 203.0.113.0:/var/nfs/elasticsearch 20G 1.2G 18G 7% /nfs/elasticsearch

Testing NFS Access

Next, let’s test access to the shares by writing something to each of them. Create a file in Elasticsearch

sudo touch /nfs/elasticsearch/elastic.test

Then look at the ownership of the file:

ls -l /nfs/elasticsearch/elastic.test

Output -rw-r--r-- 1 root root 0 May 28 13:32 /nfs/elasticsearch/elastic.test

Mounting the Remote NFS Directories at Boot

We can mount the remote NFS shares automatically at boot by adding them to /etc/fstab file on the client. Open this file with root privileges in your text editor:

sudo nano /etc/fstab

. . . 203.0.113.0:/var/nfs/elasticsearch /nfs/elasticsearch nfs auto,nofail,noatime,nolock,intr,tcp,actimeo=1800 0 0

Setting up Elasticsearch to use NFS directory

Open elasticsearch configuration file, for each node

sudo vim /etc/elasticsearch/elasticsearch.yml

change the data directory as follows using our above NFS mount

# Path to directory where to store index data allocated for this node.
path.data: /nfs/elasticsearch

restart elasticsearch service

sudo service elasticsearch restart