Integrate LVM with Hadoop and automate partitioning using a Python script

In this task we will learn how to integrate LVM with Hadoop.

Task description:

Elasticity Task:

  • Integrating LVM with Hadoop and providing elasticity to DataNode storage
  • Increasing or decreasing the size of a static partition in Linux
  • Automating LVM partitioning using a Python script

Docker Task:

  • Configuring an HTTPD server on a Docker container
  • Setting up a Python interpreter and running Python code on a Docker container

What is Hadoop?

Hadoop is an open-source framework that is widely used in the big data industry. Because of its versatility and functionality, it is a valuable tool for data scientists to know.

In simple words, Hadoop is a collection of tools that lets you store big data in a readily accessible, distributed environment and process that data in parallel.

What is a Hadoop cluster?

HDFS has a master/slave architecture. An HDFS cluster consists of a single NameNode, a master server that manages the file system namespace and regulates access to files by clients. In addition, there are a number of DataNodes, usually one per node in the cluster, which manage storage attached to the nodes that they run on. HDFS exposes a file system namespace and allows user data to be stored in files. Internally, a file is split into one or more blocks and these blocks are stored in a set of DataNodes. The NameNode executes file system namespace operations like opening, closing, and renaming files and directories. It also determines the mapping of blocks to DataNodes. The DataNodes are responsible for serving read and write requests from the file system’s clients. The DataNodes also perform block creation, deletion, and replication upon instruction from the NameNode.
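To make the block idea concrete, here is a toy sketch in Python of how a file could be split into blocks and how those blocks could be mapped to DataNodes. It is only an illustration: the function names, the round-robin placement, and the sizes are assumptions made for this example, and the real NameNode uses its own placement policy.

```python
def split_into_blocks(file_size, block_size):
    """Return the sizes of the blocks that a file of file_size bytes is split into."""
    if file_size <= 0:
        return []
    full_blocks, remainder = divmod(file_size, block_size)
    # Every block is full except (possibly) the last one.
    return [block_size] * full_blocks + ([remainder] if remainder else [])


def place_blocks(num_blocks, datanodes, replication):
    """Toy placement (assumption): put each block on `replication` distinct
    DataNodes, chosen round-robin. The real NameNode policy is different."""
    if replication > len(datanodes):
        raise ValueError("replication cannot exceed the number of DataNodes")
    placement = {}
    for block in range(num_blocks):
        placement[block] = [
            datanodes[(block + r) % len(datanodes)] for r in range(replication)
        ]
    return placement


if __name__ == "__main__":
    blocks = split_into_blocks(300, 128)  # sizes in arbitrary units
    print(blocks)
    print(place_blocks(len(blocks), ["dn1", "dn2", "dn3"], replication=2))
```

The dictionary returned by place_blocks plays the role of the block-to-DataNode mapping that the NameNode keeps.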

Understand Architecture: Apache Hadoop 3.0.0 — HDFS Architecture

LVM Storage Management Structures

LVM functions by layering abstractions on top of physical storage devices. The basic layers that LVM uses, starting with the most primitive, are:

  • Physical Volumes
  • Volume Groups
  • Logical Volumes

LVM can be used to combine physical volumes into volume groups to unify the storage space available on a system. The volume group can then be divided into arbitrary logical volumes, which act as flexible partitions.
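The three layers can be pictured with a small Python model. This is only a sketch to show how the capacities relate: the class name, the disk sizes, and the simplifications (no metadata overhead, no extents) are assumptions made for illustration, not how LVM is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class VolumeGroup:
    """Toy model: a volume group pools physical volumes and hands out logical volumes."""
    name: str
    pv_sizes_gib: dict                        # physical volume path -> size in GiB
    lvs: dict = field(default_factory=dict)   # logical volume name -> size in GiB

    @property
    def capacity(self):
        # The group's capacity is the sum of its physical volumes.
        return sum(self.pv_sizes_gib.values())

    @property
    def free(self):
        return self.capacity - sum(self.lvs.values())

    def create_lv(self, lv_name, size_gib):
        if size_gib > self.free:
            raise ValueError("not enough free space in the volume group")
        self.lvs[lv_name] = size_gib

    def resize_lv(self, lv_name, delta_gib):
        new_size = self.lvs[lv_name] + delta_gib
        if new_size <= 0 or delta_gib > self.free:
            raise ValueError("invalid resize")
        self.lvs[lv_name] = new_size


if __name__ == "__main__":
    # Hypothetical disk sizes, chosen only for the example.
    vg = VolumeGroup("V_group", {"/dev/sdb": 3, "/dev/sdc": 3})
    vg.create_lv("lvl", 4)   # larger than either disk alone
    vg.resize_lv("lvl", +1)
    print(vg.capacity, vg.lvs, vg.free)
```

Note that the 4 GiB logical volume is bigger than either of the two disks on its own, which is exactly the flexibility that a volume group provides.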

Prerequisites:

  1. The latest version of Hadoop
  2. Java installed on the system
  3. A Linux command line to carry out the configuration

We can integrate LVM with Hadoop to give the DataNode elastic storage.

Follow the steps below.

Before setting up the cluster, make sure Hadoop and Java are installed on your system.

Check that your system has Hadoop:

$hadoop version

On the NameNode, open hdfs-site.xml and update it:

$vim /etc/hadoop/hdfs-site.xml

Still on the NameNode, update core-site.xml:

$vim /etc/hadoop/core-site.xml

On the DataNode, update hdfs-site.xml:

$vim /etc/hadoop/hdfs-site.xml

Then configure core-site.xml on the DataNode:

$vim /etc/hadoop/core-site.xml
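The contents of these files are not reproduced here, so the following is only a sketch of what they might contain. It uses Python to render a Hadoop-style configuration file. The property names are assumptions that depend on your Hadoop version (dfs.data.dir and fs.default.name in Hadoop 1.x, dfs.datanode.data.dir and fs.defaultFS in later versions), and the directory /dn, the host placeholder NAMENODE_IP, and port 9001 are hypothetical values.

```python
import xml.etree.ElementTree as ET


def render_hadoop_config(properties):
    """Render a dict of name -> value pairs as a Hadoop <configuration> document."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Assumed property names and hypothetical values; adjust for your version.
    print(render_hadoop_config({"dfs.data.dir": "/dn"}))  # hdfs-site.xml (DataNode)
    print(render_hadoop_config({"fs.default.name": "hdfs://NAMENODE_IP:9001"}))  # core-site.xml
```

Compare the output with the files shipped by your Hadoop installation before copying anything into /etc/hadoop.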

Start the NameNode and the DataNode. The jps command lists the running Java processes, so you can use it to confirm that both daemons have started.

$hadoop-daemon.sh start namenode
$hadoop-daemon.sh start datanode
$jps
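If you want to script this check, a minimal Python sketch could parse the jps output and report which daemons are missing. It assumes that each line of jps output has the form "<pid> <name>"; the function names and the sample text are made up for the example.

```python
def running_daemons(jps_output):
    """Collect the process names from jps output (assumed format: '<pid> <name>')."""
    names = set()
    for line in jps_output.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            names.add(parts[1].strip())
    return names


def missing_daemons(jps_output, expected=("NameNode", "DataNode")):
    """Return the expected daemons that do not appear in the jps output."""
    running = running_daemons(jps_output)
    return [name for name in expected if name not in running]


if __name__ == "__main__":
    # Illustrative sample only; real PIDs will differ.
    sample = "2101 NameNode\n2250 DataNode\n2312 Jps\n"
    print(missing_daemons(sample))
```

In a real script the text would come from running jps, for example with subprocess.run(["jps"], capture_output=True, text=True).stdout.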

Once both daemons are running, use dfsadmin to check that the DataNode is connected to the NameNode:

$hadoop dfsadmin -report
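The number to watch in the report is the configured capacity. As a rough sketch, the Python below pulls those values out of the report text. It assumes the report contains lines of the form "Configured Capacity: <bytes> (...)", which may vary between Hadoop versions, and the sample text is invented for illustration.

```python
import re

# Assumed line format: "Configured Capacity: 4294967296 (4 GB)"
CAPACITY_RE = re.compile(r"^\s*Configured Capacity:\s*(\d+)", re.MULTILINE)


def configured_capacities(report_text):
    """Return every 'Configured Capacity' value (in bytes) found in the report."""
    return [int(value) for value in CAPACITY_RE.findall(report_text)]


def to_gib(num_bytes):
    """Convert bytes to GiB."""
    return num_bytes / 1024 ** 3


if __name__ == "__main__":
    # Invented sample: one cluster-wide line and one per-DataNode line.
    sample = (
        "Configured Capacity: 4294967296 (4 GB)\n"
        "Name: 192.0.2.10:50010\n"
        "Configured Capacity: 4294967296 (4 GB)\n"
    )
    for capacity in configured_capacities(sample):
        print(to_gib(capacity), "GiB")
```

Running the same parser before and after resizing the logical volume gives a quick way to confirm that the DataNode's contribution has changed.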

Attach the extra volumes (disks) to your system.

The added volumes appear in the output of fdisk:

$fdisk -l

To combine the two volumes, first create a physical volume on each of them:

$pvcreate /dev/sdb
$pvdisplay /dev/sdb
$pvcreate /dev/sdc
$pvdisplay /dev/sdc

Combine both physical volumes into a new volume group:

$vgcreate V_group /dev/sdb /dev/sdc
$vgdisplay V_group

Create a logical volume of the required size:

$lvcreate --size 4G --name lvl V_group
$lvdisplay V_group/lvl

Format the logical volume with an ext4 filesystem, create a new directory, and mount the logical volume on that directory:

$mkfs.ext4 /dev/V_group/lvl
$mkdir /link
$mount /dev/V_group/lvl /link
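The steps so far (physical volumes, volume group, logical volume, filesystem, mount) can be collected into one command list in Python. This is a minimal sketch: the function name is made up, it only builds the commands and does not run them, and it uses mkdir -p so that an existing directory is not an error.

```python
import shlex


def lvm_provision_commands(devices, vg_name, lv_name, size, mount_point):
    """Build (but do not run) the commands that create and mount a logical volume."""
    lv_path = f"/dev/{vg_name}/{lv_name}"
    commands = [["pvcreate", device] for device in devices]
    commands.append(["vgcreate", vg_name, *devices])
    commands.append(["lvcreate", "--size", size, "--name", lv_name, vg_name])
    commands.append(["mkfs.ext4", lv_path])
    commands.append(["mkdir", "-p", mount_point])
    commands.append(["mount", lv_path, mount_point])
    return commands


if __name__ == "__main__":
    for command in lvm_provision_commands(
        ["/dev/sdb", "/dev/sdc"], "V_group", "lvl", "4G", "/link"
    ):
        print(shlex.join(command))
```

Printing the commands first is a cheap safety net, since pvcreate and mkfs.ext4 destroy whatever is on the target device.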

The mounted logical volume appears as /dev/mapper/V_group-lvl:

$df -h
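The same numbers can be read from Python with the standard library, which is handy inside an automation script. A small sketch (the function name is made up, and "." is used only so the example runs anywhere; on the DataNode you would pass /link):

```python
import shutil


def usage_gib(path):
    """Return total, used and free space (in GiB) of the filesystem holding path."""
    total, used, free = shutil.disk_usage(path)
    gib = 1024 ** 3
    return {"total": total / gib, "used": used / gib, "free": free / gib}


if __name__ == "__main__":
    print(usage_gib("."))
```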

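Update hdfs-site.xml so that the DataNode directory points to the new mount point (/link), then start the DataNode again (if it is still running, stop it first with hadoop-daemon.sh stop datanode):

$vim /etc/hadoop/hdfs-site.xml
$hadoop-daemon.sh start datanode

Run the report again to confirm how much storage the DataNode now contributes:

$hadoop dfsadmin -report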

To increase the logical volume size:

$lvextend --size +1G /dev/V_group/lvl

The filesystem does not yet cover the extra 1 GB, so we grow it to fill the logical volume with resize2fs:

$resize2fs /dev/V_group/lvl
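Extending is always these two steps in this order: grow the logical volume, then grow the filesystem. A minimal Python sketch that builds both commands (the function name and the size check are assumptions made for the example; nothing is executed):

```python
import re


def lv_extend_commands(vg_name, lv_name, extra):
    """Build the commands that grow a logical volume and then its ext4 filesystem."""
    # Accept sizes such as "1G" or "512M"; this check is a simplification.
    if not re.fullmatch(r"\d+[MGT]", extra):
        raise ValueError("size must look like 512M, 1G or 2T")
    lv_path = f"/dev/{vg_name}/{lv_name}"
    return [
        ["lvextend", "--size", f"+{extra}", lv_path],  # grow the logical volume
        ["resize2fs", lv_path],                        # grow the filesystem to match
    ]


if __name__ == "__main__":
    for command in lv_extend_commands("V_group", "lvl", "1G"):
        print(" ".join(command))
```

Without a size argument, resize2fs grows the filesystem to the full size of the logical volume, which is why the second command needs no number.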

Now you can see the increased size:

$lvdisplay V_group/lvl

Now check again how much storage the DataNode contributes to the cluster:

$hadoop dfsadmin -report

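Similarly, we can also reduce the logical volume size. Unlike extending, shrinking can destroy data if the filesystem is not reduced first: unmount the volume, check it with e2fsck -f, and shrink it with resize2fs before running lvreduce. Note that -L 1G sets the size to 1 GB, whereas -L -1G reduces it by 1 GB:

$lvreduce -L -1G /dev/V_group/lvl

Check the LV size, which until now was the extended size:

$lvdisplay V_group/lvl

Now the logical volume has been reduced by 1 GB.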
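Shrinking needs more care than extending, because an ext4 filesystem has to be unmounted and reduced before the logical volume underneath it is made smaller. The Python sketch below builds one possible safe sequence. The function name and the sizes are assumptions for illustration, nothing is executed, and in practice the DataNode using the mount point should be stopped first and the target size must still be large enough for the data already stored.

```python
def lv_shrink_commands(vg_name, lv_name, mount_point, current_gib, reduce_by_gib):
    """Build the commands that shrink an ext4 logical volume by reduce_by_gib GiB."""
    target_gib = current_gib - reduce_by_gib
    if reduce_by_gib <= 0 or target_gib <= 0:
        raise ValueError("the target size must be positive and smaller than the current size")
    lv_path = f"/dev/{vg_name}/{lv_name}"
    return [
        ["umount", mount_point],                        # ext4 cannot be shrunk while mounted
        ["e2fsck", "-f", lv_path],                      # check the filesystem before resizing
        ["resize2fs", lv_path, f"{target_gib}G"],       # shrink the filesystem first
        ["lvreduce", "-L", f"{target_gib}G", lv_path],  # then shrink the logical volume
        ["mount", lv_path, mount_point],
    ]


if __name__ == "__main__":
    for command in lv_shrink_commands("V_group", "lvl", "/link", 5, 1):
        print(" ".join(command))
```

Here the absolute target size (4G) is passed to both resize2fs and lvreduce, so the filesystem and the logical volume end up the same size.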

Automating LVM Partition using Python-script.

Use the Git link below to get the file autodisk.py.

To be continued…
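The contents of autodisk.py are not shown here, so the following is not that script. It is only a minimal sketch of the general idea behind such automation: take a list of commands and run them one by one, with a dry-run mode that prints instead of executing. The function name and the dry_run flag are assumptions made for this example.

```python
import shlex
import subprocess


def run_commands(commands, dry_run=True):
    """Run each command in order and return the command lines that were handled.

    With dry_run=True (the default) the commands are only printed.
    """
    handled = []
    for command in commands:
        line = shlex.join(command)
        print(("[dry-run] " if dry_run else "[run] ") + line)
        if not dry_run:
            # check=True stops at the first failing command.
            subprocess.run(command, check=True)
        handled.append(line)
    return handled


if __name__ == "__main__":
    run_commands([
        ["pvcreate", "/dev/sdb"],
        ["vgcreate", "V_group", "/dev/sdb"],
    ])
```

Keeping dry-run as the default means the LVM commands, which need root and change disks permanently, only run when you explicitly pass dry_run=False.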
