Restricting Hadoop Slave node storage using the LVM concept
Sep 6, 2022
In this article, we will see how to contribute only a limited amount of storage from a data node (slave node) to a Hadoop cluster, using the Logical Volume / partitioning concept of Linux.
Setup:
- On AWS, I launched two RedHat Linux instances named ‘Master’ and ‘Slave’, and installed the JDK and Hadoop software on both of them.
- I attached a 1 GiB EBS volume to the Slave node, created a partition in the attached volume, and mounted it on the data-node directory.
- To log in to the instances remotely after launching them, I used PuTTY.
- Next, we create the 1 GiB EBS volume by clicking on Volumes in the AWS Console; I named it slave-volume. Note that the two 10 GiB volumes shown here are the root volumes of the instances.
- Now we attach our slave-volume to the running data-node instance. It is just like plugging a pen drive into a local system, or creating a virtual hard disk and attaching it; the main thing we have to supply while attaching is the instance ID.
- Attaching the volume is simple: select the volume, click the Actions button, and then provide the instance ID.
- After attaching the volume to the data node, we can see that its status changes to ‘in-use’.
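For reference, the same create-and-attach flow can also be done from the AWS CLI. The IDs and availability zone below are placeholders, not values from this setup:
aws ec2 create-volume --size 1 --volume-type gp2 --availability-zone <AZ-of-the-slave-instance>
aws ec2 attach-volume --volume-id <slave-volume-id> --instance-id <slave-instance-id> --device /dev/sdf
On Xen-based instances, a device attached as /dev/sdf typically shows up inside the OS as /dev/xvdf.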
- Since we are already logged in to the data-node instance remotely, we see the familiar Linux terminal; to confirm that the EBS volume is actually attached, we can use the fdisk -l command.
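The command itself is just:
fdisk -l
The newly attached 1 GiB disk should appear in the listing as /dev/xvdf, alongside the 10 GiB root disk.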
- Now the main concept of partitioning comes into play. To share only limited storage, we create a partition inside the attached 1 GiB volume. Here we created a partition of 512 MiB, and we can see the new device /dev/xvdf1.
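The keystroke sequence below is one way to create that 512 MiB partition with fdisk (annotations after # describe each step; the exact prompts may vary slightly between fdisk versions):
fdisk /dev/xvdf
  n        # create a new partition
  p        # primary partition
  1        # partition number 1
  (Enter)  # accept the default first sector
  +512M    # last sector: make the partition 512 MiB
  w        # write the partition table and exit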
- The data node contributes its storage to the cluster managed by the name node, which is how Hadoop addresses the storage problem in Big Data. Before we can store any data on this partition, we first have to format it.
- To format the partition we just created, run:
mkfs.ext4 /dev/xvdf1
- Next, we have to mount it on the data-node directory that we will use in the Hadoop cluster.
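If the mount-point directory does not exist on the data node yet, create it first (the name /dn2 is simply the directory used in this setup):
mkdir /dn2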
- To mount the partition on the desired directory, we have to run the following command:
mount /dev/xvdf1 /dn2
- After mounting, we can confirm the mount and check its size using the df -hT command.
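Right after mounting:
df -hT
The output should list /dev/xvdf1 mounted on /dn2 with an ext4 filesystem of roughly 512 MiB (slightly less once filesystem overhead is accounted for).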
- Here all the steps of partitioning and mounting are completed.
- Now we have to configure the hdfs-site.xml and core-site.xml files on both the name node and the data node.
- Remember, on the name node we have to format the /nn directory we created, and after that start the NameNode service using hadoop-daemon.sh start namenode.
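Assuming a Hadoop 1.x style setup where dfs.name.dir in the name node’s hdfs-site.xml points to /nn, the two steps on the name node look like this:
hadoop namenode -format
hadoop-daemon.sh start namenode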
- We then configure the same files on the data node in a similar way; just remember to put the name node’s IP in the core-site.xml file.
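Here is a minimal sketch of the two files on the data-node side, assuming Hadoop 1.x property names; NAMENODE_IP is a placeholder for the name node’s IP, and the port must match whatever the name node’s core-site.xml uses:
hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/dn2</value>
  </property>
</configuration>
core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://NAMENODE_IP:9001</value>
  </property>
</configuration>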
- At last, start the DataNode service as well.
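On the data node, that means:
hadoop-daemon.sh start datanode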
- With that, all of our configuration is complete.
- Now, with the hadoop dfsadmin -report command, we can see the status of the Hadoop cluster. This command can be run from any of the nodes.
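For example, from either node:
hadoop dfsadmin -report
The configured capacity reported for this data node should correspond to the ~512 MiB partition rather than the instance’s full root volume.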
- Finally, we have set a limit on the storage size contributed by the data node.
Thank you! Keep learning, keep growing, keep sharing!
If you enjoyed this, give it a clap and follow me on Medium for more.
Let’s connect on LinkedIn