Introduction to Logical Volume Manager with Oracle Linux

Contents

Introduction
Demonstration scenario
Installing LVM
Creating partitions
Creating the PVs
Creating and managing groups
Managing Logical Volumes
Building the file system
Increasing the volume capacity
Reducing the capacity
Adding physical drives to a group
Removing physical drives from a group
Final considerations

Introduction

Logical Volume Manager (LVM) is a Linux kernel mechanism for managing disk drives. Using LVM we can create logical volumes that span multiple physical drives, expand (or shrink) them on the fly, and keep fine-grained control over disk capacity utilization. We can also move logical volumes between physical disks, provide redundancy by striping or mirroring the data, create snapshots and more.

In this article I will introduce LVM and show the basic commands used in day-to-day management of logical volumes. I will be using a VMware Workstation virtual machine running Oracle Enterprise Linux 5 Update 5. The virtual machine is configured with five hard drives: a 20 GB system disk and four additional 50 GB disks that we will manage with LVM.

I added the four additional drives after the installation of Enterprise Linux; for the moment they are not initialized and have no partitions.

Demonstration scenario

Before we start the installation and configuration of LVM, let’s discuss the steps that we will go through. We have five SCSI disks at our disposal. The first one (/dev/sda) is managed in the traditional fashion: it is presented to the operating system as a standard physical drive and has three partitions defined – a boot partition (/dev/sda1, mounted as /boot), a swap partition (/dev/sda2) and a root partition (/dev/sda3, mounted as /). The file systems for the root and boot partitions are deployed directly on top of the drive partitions.

LVM introduces some additional layers of abstraction. It treats each drive partition (/dev/sdb1, /dev/sdc1 and so on) as a Physical Volume (PV). A set of Physical Volumes can be grouped into a Volume Group (VG). Within every Volume Group we can create multiple Logical Volumes (LVs) and deploy file systems on top of them.

In this tutorial we will define a Volume Group called database, which will store files for an Oracle database. It will contain two Logical Volumes – oradata (mounted at /u01/oradata, holding the datafiles) and backup (mounted at /u01/backup, holding database archives).

Part of the flexibility provided by LVM comes from the fact that we can add or remove Physical Volumes on the fly. This allows for easy expansion (or reduction) of the group’s capacity. If the group’s free space is not fully allocated, we can also dynamically resize any of its Logical Volumes.

Installing LVM

In order to use LVM we need two things – a specific module in the Linux kernel and a set of user commands from the lvm2 package. Enterprise Linux 5 comes with LVM support by default, so there isn’t much to configure. We will, however, verify that everything is in place.

There is no dedicated LVM kernel module. The user commands interact with the OS kernel through device-mapper. As long as we have it, we should be able to use LVM. A quick way to check if device-mapper is in place is to take a look inside /proc/misc.

[root@el5 ~]# cat /proc/misc |grep device-mapper
 63 device-mapper
[root@el5 ~]# 
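
If the device-mapper userspace tools are installed, dmsetup offers another quick check – it reports both the library and the kernel driver version (a sanity check only; the exact versions will vary between systems).

# report the device-mapper library and kernel driver versions
dmsetup version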

In case device-mapper is missing, we should enable it in the kernel by setting the following option:

Device Drivers --> Multi-device support (RAID and LVM)

    [*] Multiple devices driver support (RAID and LVM)
    < >   RAID support
    <*>   Device mapper support
    < >     Crypt target support (NEW)

The lvm2 package provides the user tools. Let’s confirm that it is installed.

[root@el5 ~]# rpm -q lvm2
lvm2-2.02.56-8.el5
[root@el5 ~]# 
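
For a single command that reports both the tool and the kernel driver versions, we can also run lvm version (its output is omitted here, as it differs between systems).

# print the LVM tool, library and device-mapper driver versions
lvm version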

This is pretty much everything we need.

Creating partitions

As we already noted, the PVs are defined on top of partitions. Let’s start by looking at the current partition table.

[root@el5 ~]# fdisk -l

Disk /dev/sda: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          13      104391   83  Linux
/dev/sda2              14         274     2096482+  82  Linux swap / Solaris
/dev/sda3             275        2610    18763920   83  Linux

Disk /dev/sdb: 53.6 GB, 53687091200 bytes
255 heads, 63 sectors/track, 6527 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdb doesn't contain a valid partition table

Disk /dev/sdc: 53.6 GB, 53687091200 bytes
255 heads, 63 sectors/track, 6527 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdc doesn't contain a valid partition table

Disk /dev/sdd: 53.6 GB, 53687091200 bytes
255 heads, 63 sectors/track, 6527 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdd doesn't contain a valid partition table

Disk /dev/sde: 53.6 GB, 53687091200 bytes
255 heads, 63 sectors/track, 6527 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sde doesn't contain a valid partition table
[root@el5 ~]# 

The command that creates Physical Volumes is called pvcreate. We can run it against a partition (pvcreate /dev/sdb1) or directly against a physical drive (pvcreate /dev/sdb). The second option seems easier, because we can skip the partition definition. You should, however, avoid it: if there is no valid partition table, other partitioning tools and operating systems might not recognize that the disk is in use by LVM and might try to initialize it, which would destroy the data. To encourage good practices we will create a single partition on each of the four drives that we plan to use with LVM. Let’s start with /dev/sdb.

[root@el5 ~]# fdisk /dev/sdb
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.


The number of cylinders for this disk is set to 6527.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): n 
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-6527, default 1): 
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-6527, default 6527): 

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
[root@el5 ~]# 
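
Optionally, we can also flag the new partition with type 8e (Linux LVM), so that partitioning tools immediately show it as an LVM member. LVM itself does not require this. Roughly, the extra fdisk steps before writing the table look like this (prompts abbreviated; fdisk may also ask for the partition number):

Command (m for help): t
Hex code (type L to list codes): 8e
Command (m for help): w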

We have to repeat the same steps for /dev/sdc, /dev/sdd and /dev/sde. After all partitions are in place, the partition table should look like this:

[root@el5 ~]# fdisk -l

Disk /dev/sda: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          13      104391   83  Linux
/dev/sda2              14         274     2096482+  82  Linux swap / Solaris
/dev/sda3             275        2610    18763920   83  Linux

Disk /dev/sdb: 53.6 GB, 53687091200 bytes
255 heads, 63 sectors/track, 6527 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1        6527    52428096   83  Linux

Disk /dev/sdc: 53.6 GB, 53687091200 bytes
255 heads, 63 sectors/track, 6527 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1        6527    52428096   83  Linux

Disk /dev/sdd: 53.6 GB, 53687091200 bytes
255 heads, 63 sectors/track, 6527 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1        6527    52428096   83  Linux

Disk /dev/sde: 53.6 GB, 53687091200 bytes
255 heads, 63 sectors/track, 6527 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sde1               1        6527    52428096   83  Linux
[root@el5 ~]# 

Creating the PVs

We run the pvcreate command to create four Physical Volumes.

[root@el5 ~]# pvcreate /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
  Physical volume "/dev/sdb1" successfully created
  Physical volume "/dev/sdc1" successfully created
  Physical volume "/dev/sdd1" successfully created
  Physical volume "/dev/sde1" successfully created
[root@el5 ~]# 

To display information about the newly created volumes we can use the pvdisplay command. We can pass a specific partition as an argument; if the command is run without arguments it will display information about all PVs.

[root@el5 ~]# pvdisplay
  "/dev/sdb1" is a new physical volume of "50.00 GB"
  --- NEW Physical volume ---
  PV Name               /dev/sdb1
  VG Name               
  PV Size               50.00 GB
  Allocatable           NO
  PE Size (KByte)       0
  Total PE              0
  Free PE               0
  Allocated PE          0
  PV UUID               NhWx3q-i61M-8lLn-JxZ1-vJV9-1AMD-9nHnC4
   
  "/dev/sdc1" is a new physical volume of "50.00 GB"
  --- NEW Physical volume ---
  PV Name               /dev/sdc1
  VG Name               
  PV Size               50.00 GB
  Allocatable           NO
  PE Size (KByte)       0
  Total PE              0
  Free PE               0
  Allocated PE          0
  PV UUID               gDnsvR-d190-1H5C-p2Vt-S9Xj-mSUn-gA0llg
   
  "/dev/sdd1" is a new physical volume of "50.00 GB"
  --- NEW Physical volume ---
  PV Name               /dev/sdd1
  VG Name               
  PV Size               50.00 GB
  Allocatable           NO
  PE Size (KByte)       0
  Total PE              0
  Free PE               0
  Allocated PE          0
  PV UUID               i8vKSA-dKoW-7n28-K9fu-ASzm-D3A8-pILdDK
   
  "/dev/sde1" is a new physical volume of "50.00 GB"
  --- NEW Physical volume ---
  PV Name               /dev/sde1
  VG Name               
  PV Size               50.00 GB
  Allocatable           NO
  PE Size (KByte)       0
  Total PE              0
  Free PE               0
  Allocated PE          0
  PV UUID               RF37f3-SfBR-aozU-7l01-Rqh9-4YV2-1o7GdK
   
[root@el5 ~]#
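
For a more compact view – one line per Physical Volume with its group, size and free space – we can also use the pvs reporting command.

# terse, one-line-per-PV summary (PV, VG, Fmt, Attr, PSize, PFree)
pvs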

Creating and managing groups

Next we will run the vgcreate command, which creates new volume groups. It accepts a name for the group and the list of PVs that the group will contain.

Let’s use all four PVs and create a group called database.

[root@el5 ~]# vgcreate datbase /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
  Volume group "datbase" successfully created
[root@el5 ~]# 

We can use the vgdisplay command to get information about the newly created group. As with pvdisplay we can optionally provide a VG name. Running the command without arguments will display information on all available volume groups.

[root@el5 ~]# vgdisplay
  --- Volume group ---
  VG Name               datbase
  System ID             
  Format                lvm2
  Metadata Areas        4
  Metadata Sequence No  1
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                0
  Open LV               0
  Max PV                0
  Cur PV                4
  Act PV                4
  VG Size               199.98 GB
  PE Size               4.00 MB
  Total PE              51196
  Alloc PE / Size       0 / 0   
  Free  PE / Size       51196 / 199.98 GB
  VG UUID               te0yjU-2ywi-V9FO-NcQ7-UqVh-Opbv-tnzy6R
   
[root@el5 ~]# 

The output from vgdisplay reveals an error in the group’s name (datbase instead of database). I made this typo on purpose, so that I can introduce another command – vgrename. As its name suggests, vgrename is used for renaming volume groups. Let’s see it in action.

[root@el5 ~]# vgrename datbase database
  Volume group "datbase" successfully renamed to "database"
[root@el5 ~]# 

If we run the vgdisplay command again, the group’s name should be correct.

[root@el5 ~]# vgdisplay
  --- Volume group ---
  VG Name               database
  System ID             
  Format                lvm2
  Metadata Areas        4
  Metadata Sequence No  2
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                0
  Open LV               0
  Max PV                0
  Cur PV                4
  Act PV                4
  VG Size               199.98 GB
  PE Size               4.00 MB
  Total PE              51196
  Alloc PE / Size       0 / 0   
  Free  PE / Size       51196 / 199.98 GB
  VG UUID               te0yjU-2ywi-V9FO-NcQ7-UqVh-Opbv-tnzy6R
   
[root@el5 ~]# 

There are two other commands that we should be aware of – vgremove and vgscan. The first deletes a volume group; the second scans all available disks for volume groups.

Let’s see how vgremove works. We can run it and delete the database group.

[root@el5 ~]# vgremove database
  Volume group "database" successfully removed
[root@el5 ~]# 

After vgremove is done, the output of vgdisplay will show no groups at all.

[root@el5 ~]# vgdisplay
[root@el5 ~]# 

Let’s recreate the database group.

[root@el5 ~]# vgcreate database /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
  Volume group "database" successfully created
[root@el5 ~]# 

After the group is created we can run vgscan. This is not a mandatory step – the group is already registered and visible through vgdisplay. We are running vgscan just to get familiar with its usage.

[root@el5 ~]# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "database" using metadata type lvm2
[root@el5 ~]# 

You can see that vgscan checked all physical volumes and that the only group it discovered is database, which is correct – database is the only VG defined on our system.
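
As with physical volumes, a terse summary of all volume groups is available through the vgs reporting command.

# one line per volume group, with PV/LV counts, total and free size
vgs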

Managing Logical Volumes

Our next step is to create the logical volumes. We will define two of them – oradata (20 GB) and backup (25 GB). Creating LVs is done via the lvcreate command, which takes three arguments – the logical volume name, its capacity and the volume group name. The last one determines in which volume group the LV will reside.

[root@el5 ~]# lvcreate --name oradata --size 20GB database
  Logical volume "oradata" created
[root@el5 ~]# 

We create backup in a similar fashion.

[root@el5 ~]# lvcreate --name backup --size 25GB database
  Logical volume "backup" created
[root@el5 ~]# 

There is a dedicated command, lvdisplay, for displaying information about logical volumes. It works much like vgdisplay.

[root@el5 ~]# lvdisplay
  --- Logical volume ---
  LV Name                /dev/database/oradata
  VG Name                database
  LV UUID                o7vLVy-yBNs-4TBb-8rq6-48DM-Rg43-QD0Qy5
  LV Write Access        read/write
  LV Status              available
  # open                 0
  LV Size                20.00 GB
  Current LE             5120
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0
   
  --- Logical volume ---
  LV Name                /dev/database/backup
  VG Name                database
  LV UUID                ya6xAf-T21Z-I0XV-LeUo-k0wR-VVee-Dbkksn
  LV Write Access        read/write
  LV Status              available
  # open                 0
  LV Size                25.00 GB
  Current LE             6400
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:1
   
[root@el5 ~]# 

The lvdisplay command reveals that both logical volumes are correctly created. Let’s create a third LV. We will name it test and set its capacity to 3 GB.

[root@el5 ~]# lvcreate --name test --size 3GB database
  Logical volume "test" created
[root@el5 ~]# 

We will use the test logical volume to demonstrate two very important commands – lvextend and lvreduce. These commands increase and reduce a logical volume’s capacity. Assume we want to increase the capacity of test from 3 to 10 GB. When using lvextend we can set the new capacity either by specifying the size in logical extents (the -l option) or by providing the desired capacity in MB, GB and so on (the -L option). Note that lvextend, like most of the LV commands, requires both the name and the volume group of the LV.

[root@el5 ~]# lvextend -L10GB database/test
  Extending logical volume test to 10.00 GB
  Logical volume test successfully resized
[root@el5 ~]# 
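
The same growth can be expressed in logical extents instead of an absolute size. With the default 4 MB extent size reported by vgdisplay, 10 GB corresponds to 2560 extents, so the following command is equivalent.

# set the LV to 2560 logical extents (2560 * 4 MB = 10 GB)
lvextend -l 2560 database/test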

Let’s run lvdisplay and see if the change is implemented.

[root@el5 ~]# lvdisplay database/test
  --- Logical volume ---
  LV Name                /dev/database/test
  VG Name                database
  LV UUID                6oEg37-9RU4-qZY7-0Nfh-3X4H-LlvG-H8v3W8
  LV Write Access        read/write
  LV Status              available
  # open                 0
  LV Size                10.00 GB
  Current LE             2560
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:2
   
[root@el5 ~]# 

It looks like the new capacity is correctly recognized. If we ever make changes that are not immediately reflected, we can run the lvscan command to rescan the LVM block devices. This is similar to running vgscan for volume groups.

[root@el5 ~]# lvscan
  ACTIVE            '/dev/database/oradata' [20.00 GB] inherit
  ACTIVE            '/dev/database/backup' [25.00 GB] inherit
  ACTIVE            '/dev/database/test' [10.00 GB] inherit
[root@el5 ~]# 
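
The lvs command gives the same overview in a compact tabular form, one line per logical volume with its group, attributes and size.

# terse, one-line-per-LV summary
lvs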

Now let’s rename the test LV to small and shrink its capacity to 1 GB. We will use the lvrename and lvreduce commands.

[root@el5 ~]# lvrename database/test small
  Renamed "test" to "small" in volume group "database"
[root@el5 ~]# 

We proceed with shrinking small.

[root@el5 ~]# lvreduce -L1GB database/small
  WARNING: Reducing active logical volume to 1.00 GB
  THIS MAY DESTROY YOUR DATA (filesystem etc.)
Do you really want to reduce small? [y/n]: y
  Reducing logical volume small to 1.00 GB
  Logical volume small successfully resized
[root@el5 ~]# 

Let’s see what database/small looks like.

[root@el5 ~]# lvdisplay database/small
  --- Logical volume ---
  LV Name                /dev/database/small
  VG Name                database
  LV UUID                6oEg37-9RU4-qZY7-0Nfh-3X4H-LlvG-H8v3W8
  LV Write Access        read/write
  LV Status              available
  # open                 0
  LV Size                1.00 GB
  Current LE             256
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:2
   
[root@el5 ~]# 

The capacity is reduced to 1 GB as expected.

We will no longer need this LV and we can delete it using lvremove.

[root@el5 ~]# lvremove database/small
Do you really want to remove active logical volume small? [y/n]: y
  Logical volume "small" successfully removed
[root@el5 ~]# 

Let’s run lvscan one more time and confirm that the only logical volumes we currently have are oradata and backup.

[root@el5 ~]# lvscan
  ACTIVE            '/dev/database/oradata' [20.00 GB] inherit
  ACTIVE            '/dev/database/backup' [25.00 GB] inherit
[root@el5 ~]#

Building the file system

Our next task is to build file systems on oradata and backup. LVM does not impose any restrictions on the file system type – we could use ext2, ext3, xfs, reiserfs and so on.

The Oracle Database, however, strictly specifies which file systems are supported for storing database files. Since we are going to use oradata for storing such files, we have to comply with these requirements. Oracle has issued a dedicated document that lists the file system types supported by Oracle Database on Enterprise Linux (Enterprise Linux: Linux, Filesystem & I/O Type Supportability, Metalink ID 279069.1). According to this document we have to stick to ext3, ocfs2 or NFS. We will therefore use ext3 for the oradata logical volume.

On the other hand, there are no restrictions for the volume that will store our backups. To illustrate the use of something other than ext3, we will use xfs as the file system type for the backup LV. In fact, xfs handles large files better than ext3, which makes it quite appropriate for storing archives. You should know, however, that Enterprise Linux 5 does not include xfs support by default, so in order to follow the xfs examples you will have to configure xfs support in advance.
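
A quick way to find out whether xfs support is already available is to check /proc/filesystems and, if needed, try to load the module – a sketch only; how you obtain the xfs kernel module and user tools depends on your setup.

# is xfs already registered with the kernel?
grep xfs /proc/filesystems
# if not, try loading the module (this fails when xfs support is not installed)
modprobe xfs
# the user-space tools (mkfs.xfs, xfs_growfs) come from the xfsprogs package
rpm -q xfsprogs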

Building an ext3 file system on top of oradata is done via the mkfs.ext3 command.

[root@el5 ~]# mkfs.ext3 /dev/database/oradata
mke2fs 1.39 (29-May-2006)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
2621440 inodes, 5242880 blocks
262144 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=0
160 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks: 
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 
        4096000

Writing inode tables: done                            
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 32 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
[root@el5 ~]# 

In order to build an xfs file system for backup we use mkfs.xfs.

[root@el5 ~]# mkfs.xfs /dev/database/backup
meta-data=/dev/database/backup   isize=256    agcount=16, agsize=409600 blks
         =                       sectsz=512   attr=0
data     =                       bsize=4096   blocks=6553600, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096  
log      =internal log           bsize=4096   blocks=3200, version=1
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0
[root@el5 ~]# 

Let’s proceed with mounting the volumes. We will use /u01/oradata as a mount point for oradata and /u01/backup as a mount point for backup.

[root@el5 ~]# mkdir -p /u01/oradata
[root@el5 ~]# mkdir /u01/backup
[root@el5 ~]# mount /dev/database/oradata /u01/oradata/
[root@el5 ~]# mount /dev/database/backup /u01/backup/

Run df in order to see the available capacity of the two volumes.

[root@el5 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3              18G  3.0G   14G  19% /
/dev/sda1              99M   12M   83M  12% /boot
tmpfs                 506M     0  506M   0% /dev/shm
/dev/mapper/database-oradata
                       20G  173M   19G   1% /u01/oradata
/dev/mapper/database-backup
                       25G  4.6M   25G   1% /u01/backup
[root@el5 ~]# 

As an additional step we can add entries for the two volumes to /etc/fstab. This will allow the system to mount them automatically during boot. Here is what the modified /etc/fstab should look like:

LABEL=/                 /                       ext3    defaults        1 1
LABEL=/boot             /boot                   ext3    defaults        1 2
tmpfs                   /dev/shm                tmpfs   defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                   /sys                    sysfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0
LABEL=SWAP-sda2         swap                    swap    defaults        0 0
/dev/database/oradata   /u01/oradata            ext3    rw,noatime      0 0
/dev/database/backup    /u01/backup             xfs     rw,noatime      0 0
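
To verify the new entries without rebooting, we can unmount the two volumes and let mount read them back from /etc/fstab – a quick sanity check.

# unmount, then remount everything listed in /etc/fstab that is not mounted yet
umount /u01/oradata /u01/backup
mount -a
df -h /u01/oradata /u01/backup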

Increasing the volume capacity

Our next goal is to learn how to control the volume capacity while keeping the file system and any existing data intact. We begin with oradata and the ext3 file system. Let’s check the current capacity via df.

[root@el5 ~]# df -h /u01/oradata
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/database-oradata
                       20G  173M   19G   1% /u01/oradata
[root@el5 ~]# 

A major drawback of ext3 is that we cannot safely resize its file system while the volume is in use – we have to unmount it first.

[root@el5 ~]# umount /u01/oradata/
[root@el5 ~]# 

We can then use the lvextend command and increase the capacity by 5 GB.

[root@el5 ~]# lvextend -L+5GB database/oradata
  Extending logical volume oradata to 25.00 GB
  Logical volume oradata successfully resized
[root@el5 ~]# 

Note the value of the -L parameter. The plus sign instructs lvextend to add the provided value to the current capacity. Another way of doing the same is simply to set the total desired capacity (-L25GB).

After the new capacity is set we have to resize the file system as well. It is always a good idea to check the file system for errors before attempting more complex operations. We will use the e2fsck command, which also requires the volume to be unmounted in order to run.

[root@el5 ~]# e2fsck -f /dev/database/oradata
e2fsck 1.39 (29-May-2006)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/database/oradata: 11/2621440 files (9.1% non-contiguous), 126323/5242880 blocks
[root@el5 ~]# 

Next we use the resize2fs command, which performs the actual resizing. Run without a size argument, it grows the file system to fill the underlying volume.

[root@el5 ~]# resize2fs /dev/database/oradata
resize2fs 1.39 (29-May-2006)
Resizing the filesystem on /dev/database/oradata to 6553600 (4k) blocks.
The filesystem on /dev/database/oradata is now 6553600 blocks long.

[root@el5 ~]# 

Let’s mount the volume back.

[root@el5 ~]# mount /u01/oradata/
[root@el5 ~]#

Checking its capacity reveals a total of 25GB.

[root@el5 ~]# df -h /u01/oradata/
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/database-oradata
                       25G  173M   24G   1% /u01/oradata
[root@el5 ~]# 

With xfs it is even easier, as we do not have to unmount the volume in order to increase its capacity. We can simply use lvextend.

[root@el5 ~]# lvextend -L+5GB database/backup
  Extending logical volume backup to 30.00 GB
  Logical volume backup successfully resized
[root@el5 ~]# 

The xfs_growfs command is used to reflect the capacity change at the file system level. The only parameter xfs_growfs needs is the mount point for the volume.

[root@el5 ~]# xfs_growfs /u01/backup
meta-data=/dev/database/backup   isize=256    agcount=16, agsize=409600 blks
         =                       sectsz=512   attr=0
data     =                       bsize=4096   blocks=6553600, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096  
log      =internal               bsize=4096   blocks=3200, version=1
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0
data blocks changed from 6553600 to 7864320
[root@el5 ~]# 

The change goes into effect immediately.

[root@el5 ~]# df -h /u01/backup
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/database-backup
                       30G  4.7M   30G   1% /u01/backup
[root@el5 ~]# 

Reducing the capacity

Now that we are familiar with the procedure for increasing a volume’s capacity, let’s learn how to do the opposite. This time we will start by resizing the file system and then use lvreduce to shrink the volume itself.

Let’s start by unmounting oradata.

[root@el5 ~]# umount /u01/oradata/
[root@el5 ~]# 

We check the file system for errors.

[root@el5 ~]# e2fsck -f /dev/database/oradata
e2fsck 1.39 (29-May-2006)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/database/oradata: 11/3276800 files (9.1% non-contiguous), 146883/6553600 blocks
[root@el5 ~]# 

When using resize2fs to shrink a file system, we can set the new size in two ways. We can either provide it with a unit suffix (K, M or G – e.g. 20G) or give it as a number of file system blocks.

We will use the latter option, so we can bring oradata back to its original capacity precisely. Before extending the volume by 5 GB we ran e2fsck, which reported the original block count (5242880 blocks of 4 KB each), and we can pass this number to resize2fs in order to shrink oradata back.

[root@el5 ~]# resize2fs /dev/database/oradata 5242880
resize2fs 1.39 (29-May-2006)
Resizing the filesystem on /dev/database/oradata to 5242880 (4k) blocks.
The filesystem on /dev/database/oradata is now 5242880 blocks long.

[root@el5 ~]# 
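
Alternatively, the same shrink can be requested by size rather than by block count; with 4 KB blocks, 20 GB corresponds exactly to 5242880 blocks.

# equivalent: shrink the file system to 20 GB
resize2fs /dev/database/oradata 20G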

Now it is time to use lvreduce and shrink the LV. We should not be worried by the data loss warning, as we have already shrunk the file system to the appropriate size.

[root@el5 ~]# lvreduce -L20G database/oradata
  WARNING: Reducing active logical volume to 20.00 GB
  THIS MAY DESTROY YOUR DATA (filesystem etc.)
Do you really want to reduce oradata? [y/n]: y
  Reducing logical volume oradata to 20.00 GB
  Logical volume oradata successfully resized
[root@el5 ~]# 

Let’s run a file system check.

[root@el5 ~]# e2fsck -f /dev/database/oradata
e2fsck 1.39 (29-May-2006)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/database/oradata: 11/2621440 files (9.1% non-contiguous), 126323/5242880 blocks
[root@el5 ~]# 

If we mount oradata and check its capacity we will see that it has been reduced back to 20GB.

[root@el5 ~]# mount /u01/oradata/
[root@el5 ~]# df -h /u01/oradata/
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/database-oradata
                       20G  173M   19G   1% /u01/oradata
[root@el5 ~]# 

You should note that there is no easy way to shrink an xfs file system. If you need to perform such an operation, you have to dump the entire volume, rebuild it at the smaller size and then restore the data from the dump file.
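
A rough outline of such an operation, assuming the xfsdump package (which provides xfsdump and xfsrestore) is installed and that /tmp has enough room for the dump file, might look like this.

# 1. dump the xfs file system to a file on another volume
xfsdump -L backup_lv -M dumpmedia -f /tmp/backup.dump /u01/backup
# 2. unmount and shrink the logical volume
umount /u01/backup
lvreduce -L20G database/backup
# 3. rebuild the file system at the new size and restore the data
mkfs.xfs -f /dev/database/backup
mount /dev/database/backup /u01/backup
xfsrestore -f /tmp/backup.dump /u01/backup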

Adding physical drives to a group

One of the tasks that you will face sooner or later is adding new hard drives to a server and reconfiguring some LVM groups to utilize the newly provided space. To illustrate the necessary steps I will add a new disk to the virtual machine and extend the database group over it.

Let’s start by adding another virtual disk. Turn off the virtual machine and add a new hard disk with a capacity of 50 GB. When the new disk is in place we should have a total of 6 hard drives configured.

Power on the machine and check that the new drive is visible. It should appear as /dev/sdf.
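
For example, fdisk can confirm that the kernel sees the new drive and that it has no partition table yet.

# list the new disk; it should report no valid partition table
fdisk -l /dev/sdf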

We should proceed by creating a partition, as we did for the other LVM member disks.

[root@el5 ~]# fdisk /dev/sdf
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.


The number of cylinders for this disk is set to 6527.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-6527, default 1): 
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-6527, default 6527): 

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
[root@el5 ~]# 

The PV comes next.

[root@el5 ~]# pvcreate /dev/sdf1
  Physical volume "/dev/sdf1" successfully created
[root@el5 ~]# 

Adding the new partition to the volume group is done via vgextend. We have to provide the VG name and the partition as arguments to the command.

[root@el5 ~]# vgextend database /dev/sdf1
  Volume group "database" successfully extended
[root@el5 ~]# 

Let’s see the information for the database group.

[root@el5 ~]# vgdisplay
  --- Volume group ---
  VG Name               database
  System ID             
  Format                lvm2
  Metadata Areas        5
  Metadata Sequence No  12
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               2
  Max PV                0
  Cur PV                5
  Act PV                5
  VG Size               249.98 GB
  PE Size               4.00 MB
  Total PE              63995
  Alloc PE / Size       12800 / 50.00 GB
  Free  PE / Size       51195 / 199.98 GB
  VG UUID               DcuO3X-HFRV-ZVyx-MVlD-gY3c-jyo0-FwR3uO
   
[root@el5 ~]# 

Note that the group’s capacity has increased to roughly 250 GB and that it now consists of five physical volumes.

Removing physical drives from a group

We already know how to add and configure a new disk on our server. Let’s see how to remove one. We might have to perform this operation if one of the physical drives starts experiencing errors. To illustrate the necessary steps I will remove the /dev/sdb drive.

Our first task is to evacuate any data residing on /dev/sdb. We can move everything located on /dev/sdb1 to the newly added /dev/sdf1 by executing the pvmove command.

[root@el5 ~]# pvmove /dev/sdb1 /dev/sdf1
  /dev/sdb1: Moved: 26.2%
  /dev/sdb1: Moved: 46.7%
  /dev/sdb1: Moved: 67.3%
  /dev/sdb1: Moved: 87.8%
  /dev/sdb1: Moved: 100.0%
[root@el5 ~]# 
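
If we do not care where the data ends up, the destination can be omitted – pvmove will then spread the extents across any other PVs in the group that have free space.

# move all allocated extents off /dev/sdb1 to any free PV in the same VG
pvmove /dev/sdb1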

After all data is moved from /dev/sdb we can use the vgreduce command and exclude the /dev/sdb1 partition from the database group.

[root@el5 ~]# vgreduce database /dev/sdb1
  Removed "/dev/sdb1" from volume group "database"
[root@el5 ~]# 

If we take a look at the group via vgdisplay we can see that it consists of only four physical volumes.

[root@el5 ~]# vgdisplay
  --- Volume group ---
  VG Name               database
  System ID             
  Format                lvm2
  Metadata Areas        4
  Metadata Sequence No  16
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               2
  Max PV                0
  Cur PV                4
  Act PV                4
  VG Size               199.98 GB
  PE Size               4.00 MB
  Total PE              51196
  Alloc PE / Size       12800 / 50.00 GB
  Free  PE / Size       38396 / 149.98 GB
  VG UUID               DcuO3X-HFRV-ZVyx-MVlD-gY3c-jyo0-FwR3uO
   
[root@el5 ~]# 

As a final step we can wipe the LVM label from the partition with pvremove, so it no longer appears in pvdisplay’s output.

[root@el5 ~]# pvremove /dev/sdb1
  Labels on physical volume "/dev/sdb1" successfully wiped
[root@el5 ~]# 

We can now safely power off the machine and remove the second virtual disk (Hard Disk 2), bringing the virtual machine back to five hard drives.

After the machine is powered back on we can take a look at the output of pvdisplay.

[root@el5 ~]# pvdisplay
  --- Physical volume ---
  PV Name               /dev/sdb1
  VG Name               database
  PV Size               50.00 GB / not usable 3.31 MB
  Allocatable           yes 
  PE Size (KByte)       4096
  Total PE              12799
  Free PE               5119
  Allocated PE          7680
  PV UUID               gDnsvR-d190-1H5C-p2Vt-S9Xj-mSUn-gA0llg
   
  --- Physical volume ---
  PV Name               /dev/sdc1
  VG Name               database
  PV Size               50.00 GB / not usable 3.31 MB
  Allocatable           yes 
  PE Size (KByte)       4096
  Total PE              12799
  Free PE               12799
  Allocated PE          0
  PV UUID               i8vKSA-dKoW-7n28-K9fu-ASzm-D3A8-pILdDK
   
  --- Physical volume ---
  PV Name               /dev/sdd1
  VG Name               database
  PV Size               50.00 GB / not usable 3.31 MB
  Allocatable           yes 
  PE Size (KByte)       4096
  Total PE              12799
  Free PE               12799
  Allocated PE          0
  PV UUID               RF37f3-SfBR-aozU-7l01-Rqh9-4YV2-1o7GdK
   
  --- Physical volume ---
  PV Name               /dev/sde1
  VG Name               database
  PV Size               50.00 GB / not usable 3.31 MB
  Allocatable           yes 
  PE Size (KByte)       4096
  Total PE              12799
  Free PE               7679
  Allocated PE          5120
  PV UUID               VwoLJW-DhQB-9YKz-2fy5-ddwI-yLsA-oc9lnD
   
[root@el5 ~]# 

We shouldn’t worry that /dev/sdb1 is listed above. When the physical drive disappeared, all drive names shifted by one letter, so /dev/sdc became /dev/sdb, /dev/sdd became /dev/sdc and so on; that is also why we no longer see /dev/sdf. LVM actually recognizes the drives by a unique identifier (the PV UUID), not by their names under /dev. You can easily check that /dev/sdb1’s identifier (gDnsvR-d190-1H5C-p2Vt-S9Xj-mSUn-gA0llg) is the one that belonged to /dev/sdc1 in the previous pvdisplay run (before taking the second virtual disk out).
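
A quick way to match device names against UUIDs is to ask pvs to include the UUID column.

# list PV name, volume group and UUID for every physical volume
pvs -o pv_name,vg_name,pv_uuid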

Let’s look at the database group.

[root@el5 ~]# vgdisplay
  --- Volume group ---
  VG Name               database
  System ID             
  Format                lvm2
  Metadata Areas        4
  Metadata Sequence No  16
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               2
  Max PV                0
  Cur PV                4
  Act PV                4
  VG Size               199.98 GB
  PE Size               4.00 MB
  Total PE              51196
  Alloc PE / Size       12800 / 50.00 GB
  Free  PE / Size       38396 / 149.98 GB
  VG UUID               DcuO3X-HFRV-ZVyx-MVlD-gY3c-jyo0-FwR3uO
   
[root@el5 ~]# 

There are only 4 physical volumes in it, as expected.

Final considerations

It is generally not a good idea to allocate the group’s capacity in full. A better approach is to plan the individual space requirements of every LV and leave the rest of the group’s capacity unallocated. This keeps some free space that we can allocate on demand at a later stage. Keep in mind that, as a general rule, it is much easier to extend than to shrink.

Most GNU/Linux experts advise against using LVM for the root file system. Although it is tempting to have all of LVM’s advantages for the entire server, recovering a root file system that resides inside LVM is much harder (and may be impossible if the LVM metadata itself gets damaged).

There are many disputes regarding LVM vs. Oracle ASM. My advice is simple – if you are going to use RAC, go for ASM; if you are running a single instance, go for LVM. You can find performance studies showing LVM as 10-15% faster [1]. If the database is not shared between different nodes, LVM looks like the better option.

If you are interested in combining LVM and software RAID, check out the second part of this tutorial – Adding software RAID to LVM configurations.

[1] Bert Scalzo, “Optimizing Oracle 10g on Linux: Non-RAC ASM vs. LVM”, Linux Journal, August 31, 2005.