Integrating Hadoop Cluster with LVM to provide Elasticity to Data node Storage

Priyanka Bhide
Mar 24, 2021

What is LVM?

Logical Volume Management (LVM) provides a method of allocating space on mass-storage devices that is more flexible than conventional partitioning schemes.
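Concretely, LVM stacks three layers: physical volumes (PVs) are pooled into a volume group (VG), and logical volumes (LVs) are carved out of the VG and can be resized later. A minimal sketch of that layering, assuming a spare disk at /dev/sdb (the device used later in this article) and root privileges:

```shell
# LVM layering: disk -> PV -> VG -> LV (run as root)
pvcreate /dev/sdb                        # mark the disk as a physical volume
vgcreate myvg1 /dev/sdb                  # pool the PV into a volume group
lvcreate --size 10G --name mylv1 myvg1   # carve a 10G logical volume out of the VG
lvextend --size +10G /dev/myvg1/mylv1    # later: grow the LV on the fly
```

The key point for elasticity is the last line: an LV can be extended while mounted, as long as the VG still has free extents.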

What is Elasticity?

Elasticity here means the ability to increase or decrease the storage a Hadoop DataNode contributes to the cluster. The storage a DataNode shares should not be fixed, so LVM is used to make it dynamic.

Hadoop itself does not provide storage elasticity, so to make the Hadoop cluster elastic we use LVM.

Let’s get started with the task…

Step-1 : Set up the master node and the data node, and attach one external volume of 50GB for LVM management. Check the details using the following command.

command : fdisk -l

Step-2 : Create Physical Volume(PV) of the attached storage.

Command : pvcreate /dev/sdb

To verify the PV, use the pvdisplay /dev/sdb command.

Step-3 : Create a Volume Group (VG) from the PV; later we will carve a 10GB partition out of this VG.

command : vgcreate myvg1 /dev/sdb

To verify the VG, use the vgdisplay myvg1 command.

Step-4 : Now create a Logical Volume (LV) partition of 10GB from the VG (myvg1).

Command : lvcreate --size 10G --name mylv1 myvg1

To verify the LV partition, use the lvdisplay /dev/myvg1/mylv1 command.

Step-5 : Format the LV partition and then mount it on a mount point. Format it with the ext4 filesystem using mkfs.ext4.

command : mkfs.ext4 /dev/myvg1/mylv1

Use the fdisk -l command to confirm the drive.

Write the following configuration in Hadoop's hdfs-site.xml file:
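The screenshot of the configuration is not reproduced here. A typical hdfs-site.xml for this setup points the DataNode's storage directory at /dn1, the mount point used below; the exact properties in the original article may differ:

```xml
<configuration>
  <!-- DataNode stores its blocks under /dn1, where the LV is mounted -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/dn1</value>
  </property>
</configuration>
```

Because HDFS reports whatever free space the filesystem at this directory has, growing the underlying LV is enough to grow the DataNode's advertised capacity.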

Now mount the LV partition on the data node's storage directory, i.e. /dn1, which is shared with the master node.

command : mount /dev/myvg1/mylv1 /dn1/

To check the mount point use df -h command.
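Note that a mount made this way does not survive a reboot. One common way to make it persistent (an optional extra, not shown in the original article) is an /etc/fstab entry for the same device and mount point:

```shell
# /etc/fstab entry: device  mount-point  fstype  options  dump  fsck-order
/dev/myvg1/mylv1  /dn1  ext4  defaults  0  0
```

After adding the line, `mount -a` applies all fstab entries and confirms the syntax is valid.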

Now, check the cluster report from the master node.

Command : hadoop dfsadmin -report

As the report shows, the Data Node is sharing 10GB (about 9.78GB usable) with the Master Node.

If our 10GB volume gets exhausted by stored data, we can increase the size of the logical volume on the fly, as long as free space remains in the VG, because we used LVM.

Let's increase the size of the LV from 10GB to 20GB.

Step-6 : To increase the size of the LV within the VG, simply use lvextend.

command : lvextend --size +10G /dev/myvg1/mylv1

Step-7 : The extra 10GB of the LV has no filesystem on it yet. Grow the existing ext4 filesystem over the enlarged LV using the resize2fs tool; this works online, without reformatting or unmounting.

Command : resize2fs /dev/myvg1/mylv1
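As an aside, lvextend can grow the filesystem in the same step via its -r (--resizefs) flag, which invokes the appropriate resize tool for you. This is an alternative to running resize2fs separately, assuming the same device path as above:

```shell
# Grow the LV and the ext4 filesystem on it in one command (run as root)
lvextend -r --size +10G /dev/myvg1/mylv1
```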

Step-8 : That completes the setup. Let's check whether our DataNode storage has been increased.

command : hadoop dfsadmin -report

Yes…It has been increased from 10GB to 20GB.✌🏻

Conclusion

We have completed the task of adding elasticity to Hadoop DataNode storage, successfully increasing the size of the LV from 10GB to 20GB as per our requirement.

Thanks for Reading !!!
