Restricting Hadoop Slave Node Storage Using the LVM Concept

Krithika Sharma
4 min read · Sep 6, 2022


In this article, we will see how to contribute limited storage from the data node (slave node) to a Hadoop cluster using the logical volume/partition concept of Linux.


Setup:

  • On AWS, two instances were launched from a RedHat Linux image, named ‘Master’ and ‘Slave’, and the JDK and Hadoop software were installed on both (a hedged installation sketch follows).
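A minimal sketch of that installation step, assuming the JDK and Hadoop 1.x RPM packages have already been copied to each instance; the file names below are placeholders, not the exact ones used in this setup:

# run on both Master and Slave, as root
rpm -ivh jdk-8uXXX-linux-x64.rpm
rpm -ivh hadoop-1.2.1-1.x86_64.rpm
java -version     # confirm the JDK is installed
hadoop version    # confirm Hadoop is installed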
  • An EBS volume of 1 GiB was attached to the Slave node, a partition was created in it, and that partition was mounted on the data-node directory (detailed step by step below).
  • After the instances were launched, I used the PuTTY software to log in to them remotely.
  • Next, we create an EBS volume of 1 GiB from the Volumes section of the AWS console. I named it slave-volume. Notice that the two 10 GiB volumes are the root volumes of the instances.
  • Now we attach the slave-volume to the running data-node instance. This is just like plugging a pen drive into a local system, or creating a virtual hard disk and attaching it; essentially, all we have to supply while attaching is the instance ID.
  • Attaching the volume is simple: select the volume, click the Actions button, and then provide the instance ID.
  • After attaching the volume to the data node, it shows up with the ‘in-use’ status.
  • Since we are already logged in to the data-node instance, we can confirm that the attached EBS volume is visible using the fdisk -l command.
  • Now the main concept of partitioning comes into play. To share only limited storage, we create a partition inside the attached 1 GiB volume. Here a partition of 512 MiB was created, and the device /dev/xvdf1 appears (a sketch of the fdisk session follows).
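A hedged sketch of the interactive fdisk session used to carve out the 512 MiB partition; /dev/xvdf is how the attached volume showed up in this setup and may differ on your instance:

fdisk /dev/xvdf
  n        # create a new partition
  p        # primary partition
  1        # partition number 1
  <enter>  # accept the default first sector
  +512M    # make the partition 512 MiB
  w        # write the partition table and exit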
  • We know the data node shares its storage with the name node to address the storage problem we have in Big Data. So, before storing any data in this partition, we first have to format it.
  • To format the partition we created, run the following command:
mkfs.ext4 /dev/xvdf1
  • Now we have to mount it on the data-node directory we will be using in the Hadoop cluster.
  • To mount the partition on the desired directory (create it first with mkdir /dn2 if it does not already exist), run the following command:
mount /dev/xvdf1 /dn2
  • After mounting, we can confirm the mount and check its size using the df -hT command.
  • With this, all the partitioning and mounting steps are complete.
  • Now we have to configure the hdfs-site.xml and core-site.xml files on both the name node and the data node (a configuration sketch follows).
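A hedged sketch of the name-node side configuration, assuming a Hadoop 1.x-style setup where /nn is the name-node directory created for this cluster; port 9001 is just an example choice:

<!-- hdfs-site.xml on the name node -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/nn</value>
  </property>
</configuration>

<!-- core-site.xml on the name node -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://0.0.0.0:9001</value>
  </property>
</configuration>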
  • Remember, on the name node we have to format the /nn directory we created, and after that start the NameNode service using hadoop-daemon.sh start namenode (commands below).
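On the name node (Hadoop 1.x-style commands, matching the daemon script used in this article):

hadoop namenode -format
hadoop-daemon.sh start namenode
jps    # optional: confirm the NameNode process is running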
  • Now we have to configure the same files on the data node in a similar way; just remember to give the name node’s IP in the core-site.xml file (sketch below).
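A similar sketch for the data-node side, assuming /dn2 is the directory we mounted earlier; NAMENODE_IP is a placeholder for the Master instance’s IP:

<!-- hdfs-site.xml on the data node -->
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/dn2</value>
  </property>
</configuration>

<!-- core-site.xml on the data node -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://NAMENODE_IP:9001</value>
  </property>
</configuration>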
  • At last, start the DataNode service as well, as shown below.
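On the data node (same Hadoop 1.x-style daemon script):

hadoop-daemon.sh start datanode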
  • With that, all our configuration is complete.
  • Now, using the hadoop dfsadmin -report command, we can see the status of the Hadoop cluster. This command can be run from any node.
  • Finally, we have limited the storage size the data node contributes.

Thank you! Keep learning, keep growing, keep sharing!

If you enjoyed this, give it a clap and follow me on Medium for more.
Let’s connect on LinkedIn!

