Posted on Leave a comment

Choose between Btrfs and LVM-ext4

Fedora 33 introduced a new default filesystem in desktop variants, Btrfs. After years of Fedora using ext4 on top of Logical Volume Manager (LVM) volumes, this is a big shift. Changing the default file system requires compelling reasons. While Btrfs is an exciting next-generation file system, ext4 on LVM is well established and stable. This guide aims to explore the high-level features of each and make it easier to choose between Btrfs and LVM-ext4.

In summary

The simplest advice is to stick with the defaults. A fresh Fedora 33 install defaults to Btrfs and upgrading a previous Fedora release continues to use whatever was initially installed, typically LVM-ext4. For an existing Fedora user, the cleanest way to get Btrfs is with a fresh install. However, a fresh install is much more disruptive than a simple upgrade. Unless there is a specific need, this disruption could be unnecessary. The Fedora development team carefully considered both defaults, so be confident with either choice.

What about all the other file systems?

There are a large number of file systems for Linux systems. The number explodes after adding in combinations of volume managers, encryption methods, and storage mechanisms . So why focus on Btrfs and LVM-ext4? For the Fedora audience these two setups are likely to be the most common. Ext4 on top of LVM became the default disk layout in Fedora 11, and ext3 on top of LVM came before that.

Now that Btrfs is the default for Fedora 33, the vast majority of existing users will be looking at whether they should stay where they are or make the jump forward. Faced with a fresh Fedora 33 install, experienced Linux users may wonder whether to use this new file system or fall back to what they are familiar with. So out of the wide field of possible storage options, many Fedora users will wonder how to choose between Btrfs and LVM-ext4.

Commonalities

Despite core differences between the two setups, Btrfs and LVM-ext4 actually have a lot in common. Both are mature and well-tested storage technologies. LVM has been in continuous use since the early days of Fedora Core and ext4 became the default in 2009 with Fedora 11. Btrfs merged into the mainline Linux kernel in 2009 and Facebook uses it widely. SUSE Linux Enterprise 12 made it the default in 2014. So there is plenty of production run time there as well.

Both systems do a great job preventing file system corruption due to unexpected power outages, even though the way they accomplish it is different. Supported configurations include single drive setups as well as spanning multiple devices, and both are capable of creating nearly instant snapshots. A variety of tools exist to help manage either system, both with the command line and graphical interfaces. Either solution works equally well on home desktops and on high-end servers.

Advantages of LVM-ext4

Show the relationship of LVM-ext4 filesystem to hard-drive partitions and mounted directories.
Structure of ext4 on LVM

The ext4 file system focuses on high-performance and scalability, without a lot of extra frills. It is effective at preventing fragmentation over extended periods of time and provides nice tools for when it does happen. Ext4 is rock solid because it built on the previous ext3 file system, bringing with it all the years of in-system testing and bug fixes.

Most of the advanced capabilities in the LVM-ext4 setup come from LVM itself. LVM sits “below” the file system, which means it supports any file system. Logical volumes (LV) are generic block devices so virtual machines can use them directly. This flexibility allows each logical volume to use the right file system, with the right options, for a variety of situations. This layered approach also honors the Unix philosophy of small tools working together.

The volume group (VG) abstraction from the hardware allows LVM to create flexible logical volumes. Each LV pulls from the same storage pool but has its own configuration. Resizing volumes is a lot easier than resizing physical partitions as there are no limitation of ordered placement of the data. LVM physical volumes (PV) can be any number of partitions and can even move between devices while the system is running.

LVM supports read-only and read-write snapshots, which make it easy to create consistent backups from active systems. Each snapshot has a defined size, and a change to the source or snapshot volume use space from there. Alternately, logical volumes can also be part of a thinly provisioned pool. This allows snapshots to automatically use data from a pool instead of consuming fixed sized chunks defined at volume creation.

Multiple devices with LVM

LVM really shines when there are multiple devices. It has native support for most RAID levels and each logical volume can have a different RAID level. LVM will automatically choose appropriate physical devices for the RAID configuration or the user can specify it directly. Basic RAID support includes data striping for performance (RAID0) and mirroring for redundancy (RAID1). Logical volumes can also use advanced setups like RAID5, RAID6, and RAID10. LVM RAID support is mature because under the hood LVM uses the same device-mapper (dm) and multiple-device (md) kernel support used by mdadm.

Logical volumes can also be cached volumes for systems with both fast and slow drives. A classic example is a combination of SSD and spinning-disk drives. Cached volumes use faster drives for more frequently accessed data (or as a write cache), and the slower drive for bulk data.

The large number of stable features in LVM and the reliable performance of ext4 are a testament to how long they have been in use. Of course, with more features comes complexity. It can be challenging to find the right options for the right feature when configuring LVM. For single drive desktop systems, features of LVM like RAID and cache volumes don’t apply. However, logical volumes are more flexible than physical partitions and snapshots are useful. For normal desktop use, the complexity of LVM can also be a barrier to recovering from issues a typical user might encounter.

Advantages of Btrfs

Show the relationship of Btrfs filesystem to hard-drive partitions and mounted directories.
Btrfs Structure

Lessons learned from previous generations guided the features built into Btrfs. Unlike ext4, it can directly span multiple devices, so it brings along features typically found only in volume managers. It also has features that are unique in the Linux file system space (ZFS has a similar feature set, but don’t expect it in the Linux kernel).

Key Btrfs features

Perhaps the most important feature is the checksumming of all data. Checksumming, along with copy-on-write, provides the key method of ensuring file system integrity after unexpected power loss. More uniquely, checksumming can detect errors in the data itself. Silent data corruption, sometimes referred to as bitrot, is more common that most people realize. Without active validation, corruption can end up propagating to all available backups. This leaves the user with no valid copies. By transparently checksumming all data, Btrfs is able to immediately detect any such corruption. Enabling the right dup or raid option allows the file system to transparently fix the corruption as well.

Copy-on-write (COW) is also a fundamental feature of Btrfs, as it is critical in providing file system integrity and instant subvolume snapshots. Snapshots automatically share underlying data when created from common subvolumes. Additionally, after-the-fact deduplication uses the same technology to eliminate identical data blocks. Individual files can use COW features by calling cp with the reflink option. Reflink copies are especially useful for copying large files, such as virtual machine images, that tend to have mostly identical data over time.

Btrfs supports spanning multiple devices with no volume manager required. Multiple device support unlocks data mirroring for redundancy and striping for performance. There is also experimental support for more advanced RAID levels, such as RAID5 and RAID6. Unlike standard RAID setups, the Btrfs raid1 option actually allows an odd number of devices. For example, it can use 3 devices, even if they are are different sizes.

All RAID and dup options are specified at the file system level. As a consequence, individual subvolumes cannot use different options. Note that using the RAID1 option with multiple devices means that all data in the volume is available even if one device fails and the checksum feature maintains the integrity of the data itself. That is beyond what current typical RAID setups can provide.

Additional features

Btrfs also enables quick and easy remote backups. Subvolume snapshots can be sent to a remote system for storage. By leveraging the inherent COW meta-data in the file system, these transfers are efficient by only sending incremental changes from previously sent snapshots. User applications such as snapper make it easy to manage these snapshots.

Additionally, a Btrfs volume can have transparent compression and chattr +c will mark individual files or directories for compression. Not only does compression reduce the space consumed by data, but it helps extend the life of SSDs by reducing the volume of write operations. Compression certainly introduces additional CPU overhead, but a lot of options are available to dial in the right trade-offs.

The integration of file system and volume manager functions by Btrfs means that overall maintenance is simpler than LVM-ext4. Certainly this integration comes with less flexibility, but for most desktop, and even server, setups it is more than sufficient.

Btrfs on LVM

Btrfs can convert an ext3/ext4 file system in place. In-place conversion means no data to copy out and then back in. The data blocks themselves are not even modified. As a result, one option for an existing LVM-ext4 systems is to leave LVM in place and simply convert ext4 over to Btrfs. While doable and supported, there are reasons why this isn’t the best option.

Some of the appeal of Btrfs is the easier management that comes with a file system integrated with a volume manager. By running on top of LVM, there is still some other volume manager in play for any system maintenance. Also, LVM setups typically have multiple fixed sized logical volumes with independent file systems. While Btrfs supports multiple volumes in a given computer, many of the nice features expect a single volume with multiple subvolumes. The user is still stuck manually managing fixed sized LVM volumes if each one has an independent Btrfs volume. Though, the ability to shrink mounted Btrfs filesystems does make working with fixed sized volumes less painful. With online shrink there is no need to boot a live image.

The physical locations of logical volumes must be carefully considered when using the multiple device support of Btrfs. To Btrfs, each LV is a separate physical device and if that is not actually the case, then certain data availability features might make the wrong decision. For example, using raid1 for data typically provides protection if a single drive fails. If the actual logical volumes are on the same physical device, then there is no redundancy.

If there is a strong need for some particular LVM feature, such as raw block devices or cached logical volumes, then running Btrfs on top of LVM makes sense. In this configuration, Btrfs still provides most of its advantages such as checksumming and easy sending of incremental snapshots. While LVM has some operational overhead when used, it is no more so with Btrfs than with any other file system.

Wrap up

When trying to choose between Btrfs and LVM-ext4 there is no single right answer. Each user has unique requirements, and the same user may have different systems with different needs. Take a look at the feature set of each configuration, and decide if there is something compelling about one over the other. If not, there is nothing wrong with sticking with the defaults. There are excellent reasons to choose either setup.

Posted on Leave a comment

Recover your files from Btrfs snapshots

As you have seen in a previous article, Btrfs snapshots are a convenient and fast way to make backups. Please note that these articles do not suggest that you avoid backup software or well-tested backup plans. Their goals are to show a great feature of this file system, snapshots, and to inspire curiosity and invite you to explore, experiment and deepen the subject. Read on for more about how to recover your files from Btrfs snapshots.

A subvolume for your project

Let’s assume that you want to save the documents related to a project inside the directory $HOME/Documents/myproject.

As you have seen, a Btrfs subvolume, as well as a snapshot, looks like a normal directory. Why not use a Btrfs subvolume for your project, in order to take advantage of snapshots? To create the subvolume, use this command:

btrfs subvolume create $HOME/Documents/myproject

You can create an hidden directory where to arrange your snapshots:

mkdir $HOME/.snapshots

As you can see, in this case there’s no need to use sudo. However, sudo is still needed to list the subvolumes, and to use the send and receive commands.

Now you can start writing your documents. Each day (or each hour, or even minute) you can take a snapshot just before you start to work:

btrfs subvolume snapshot -r $HOME/Documents/myproject $HOME/.snapshots/myproject-day1

For better security and consistency, and if you need to send the snapshot to an external drive as shown in the previous article, remember that the snapshot must be read only, using the -r flag.

Note that in this case, a snapshot of the /home subvolume will not snapshot the $HOME/Documents/myproject subvolume.

How to recover a file or a directory

In this example let’s assume a classic error: you deleted a file by mistake. You can recover it from the most recent snapshot, or recover an older version of the file from an older snapshot. Do you remember that a snapshot appears like a regular directory? You can simply use the cp command to restore the deleted file:

cp $HOME/.snapshots/myproject-day1/filename.odt $HOME/Documents/myproject

Or restore an entire directory:

cp -r $HOME/.snapshots/myproject-day1/directory $HOME/Documents/myproject

What if you delete the entire $HOME/Documents/myproject directory (actually, the subvolume)? You can recreate the subvolume as seen before, and again, you can simply use the cp command to restore the entire content from the snapshot:

btrfs subvolume create $HOME/Documents/myproject
cp -rT $HOME/.snapshots/myproject-day1 $HOME/Documents/myproject

Or you could restore the subvolume by using the btrfs snapshot command (yes, a snapshot of a snapshot):

btrfs subvolume snapshot $HOME/.snapshots/myproject-day1 $HOME/Documents/myproject

How to recover btrfs snapshots from an external drive

You can use the cp command even if the snapshot resides on an external drive. For instance:

cp /run/media/user/mydisk/bk/myproject-day1/filename.odt $HOME/Documents/myproject

You can restore an entire snapshot as well. In this case, since you will use the send and receive commands, you must use sudo. In addition, consider that the restored subvolume will be created as read only. Therefore you need to also set the read only property to false:

sudo btrfs send /run/media/user/mydisk/bk/myproject-day1 | sudo btrfs receive $HOME/Documents/
mv Documents/myproject-day1 Documents/myproject
btrfs property set Documents/myproject ro false

Here’s an extra explanation. The command btrfs subvolume snapshot will create an exact copy of a subvolume in the same device. The destination has to reside in the same btrfs device. You can’t use another device as the destination of the snapshot. In that case you need to take a snapshot and use the send and receive commands.

For more information, refer to some of the online documentation:

man btrfs-subvolume
man btrfs-send
man btrfs-receive
Posted on Leave a comment

Command line quick tips: More about permissions

A previous article covered some basics about file permissions on your Fedora system. This installment shows you additional ways to use permissions to manage file access and sharing. It also builds on the knowledge and examples in the previous article, so if you haven’t read that one, do check it out.

Symbolic and octal

In the previous article you saw how there are three distinct permission sets for a file. The user that owns the file has a set, members of the group that owns the file has a set, and then a final set is for everyone else. These permissions are expressed on screen in a long listing (ls -l) using symbolic mode.

Each set has r, w, and x entries for whether a particular user (owner, group member, or other) can read, write, or execute that file. But there’s another way to express these permissions: in octal mode.

You’re used to the decimal numbering system, which has ten distinct values (0 through 9). The octal system, on the other hand, has eight distinct values (0 through 7). In the case of permissions, octal is used as a shorthand to show the value of the r, w, and x fields. Think of each field as having a value:

  • r = 4
  • w = 2
  • x = 1

Now you can express any combination with a single octal value. For instance, read and write permission, but no execute permission, would have a value of 6. Read and execute permission only would have a value of 5. A file’s rwxr-xr-x symbolic permission has an octal value of 755.

You can use octal values to set file permissions with the chmod command similarly to symbolic values. The following two commands set the same permissions on a file:

chmod u=rw,g=r,o=r myfile1
chmod 644 myfile1

Special permission bits

There are several special permission bits also available on a file. These are called setuid (or suid), setgid (or sgid), and the sticky bit (or delete inhibit). Think of this as yet another set of octal values:

  • setuid = 4
  • setgid = 2
  • sticky = 1

The setuid bit is ignored unless the file is executable. If that’s the case, the file (presumably an app or a script) runs as if it were launched by the user who owns the file. A good example of setuid is the /bin/passwd utility, which allows a user to set or change passwords. This utility must be able to write to files no user should be allowed to change. Therefore it is carefully written, owned by the root user, and has a setuid bit so it can alter the password related files.

The setgid bit works similarly for executable files. The file will run with the permissions of the group that owns it. However, setgid also has an additional use for directories. If a file is created in a directory with setgid permission, the group owner for the file will be set to the group owner of the directory.

Finally, the sticky bit, while ignored for files, is useful for directories. The sticky bit set on a directory will prevent a user from deleting files in that directory owned by other users.

The way to set these bits with chmod in octal mode is to add a value prefix, such as 4755 to add setuid to an executable file. In symbolic mode, the u and g can be used to set or remove setuid and setgid, such as u+s,g+s. The sticky bit is set using o+t. (Other combinations, like o+s or u+t, are meaningless and ignored.)

Sharing and special permissions

Recall the example from the previous article concerning a finance team that needs to share files. As you can imagine, the special permission bits help to solve their problem even more effectively. The original solution simply made a directory the whole group could write to:

drwxrwx---. 2 root finance 4096 Jul 6 15:35 finance

One problem with this directory is that users dwayne and jill, who are both members of the finance group, can delete each other’s files. That’s not optimal for a shared space. It might be useful in some situations, but probably not when dealing with financial records!

Another problem is that files in this directory may not be truly shared, because they will be owned by the default groups of dwayne and jill — most likely the user private groups also named dwayne and jill.

A better way to solve this is to set both setgid and the sticky bit on the folder. This will do two things — cause files created in the folder to be owned by the finance group automatically, and prevent dwayne and jill from deleting each other’s files. Either of these commands will work:

sudo chmod 3770 finance
sudo chmod u+rwx,g+rwxs,o+t finance

The long listing for the file now shows the new special permissions applied. The sticky bit appears as T and not t because the folder is not searchable for users outside the finance group.

drwxrws--T. 2 root finance 4096 Jul 6 15:35 finance

Posted on Leave a comment

Convert file systems with Fstransform

Few people know that they can convert their filesystems from one type to another without losing data, i.e. non-destructively. It may sound like magic, but Fstransform can convert an ext2, ext3, ext4, jfs, reiserfs or xfs partition to another type from the list in almost any combination. More importantly, it does so in-place, without formatting or copying data anywhere. Atop of all this goodness, there is a little bonus: Fstransform can also handle ntfs, btrfs, fat and exfat partitions as well.

Before you run it

There are certain caveats and limitations in Fstransform, so it is strongly advised to back up before attempting a conversion. Additionally, there are some limitations to be aware of when using Fstransform:

  • Both the source and target filesystems must be supported by your Linux kernel. Sounds like an obvious thing and exposes zero risk in case you want to use ext2, ext3, ext4, reiserfs, jfs and xfs partitions. Fedora supports all of that just fine.
  • Upgrading ext2 to ext3 or ext4 does not require Fstransform. Use the Tune2fs utility instead.
  • The device with source file system must have at least 5% of free space.
  • You need to be able to unmount the source filesystem before you begin.
  • The more data your source file system stores, the longer the conversion will last. The actual speed depends on your device, but expect it to be around one gigabyte per minute. The large amount of hard links can also slow down the conversion.
  • Although Fstransform is proved to be stable, please back up data on your source filesystem.

Installation instructions

Fstransform is already a part of Fedora. Install with the command:

sudo dnf install fstransform

Time to convert something

Converting one file system to another in-place can take a while

The syntax of the fstransform command is very simple: fstransform <source device> <target file system>. Keep in mind that it needs root privileges to run, so don’t forget to add sudo in the beginning. Here goes an example:

sudo fstransform /dev/sdb1 ext4

Note that it is not possible to convert a root file system, which is a security measure. Use a test partition or an experimental thumb drive instead. In the meantime, Fstransform will through a lot of auxiliary output in the console. The most useful part is the estimated time of completion, which keep you informed about how long  the process will take. Again, few small files on an almost empty drive will make Fstransform do its job in a minute or so, whereas more real-world tasks may involve hours of wait time.

More file systems are supported

As mentioned above, it is possible to try Fstransform with ntfs, btrfs, fat and exfat partitions. These types are very experimental, and nobody can guarantee that the converion will flow perfect. Still, there are many success stories, and you can add your own by testing Fstransform with a sample data set on a test partition. Those additional file systems can be enabled by the use of the –force-untested-file-systems parameter:

sudo fstransform /dev/sdb1 ntfs --force-untested-file-systems

Sometimes the process may iterrupt with an error. Feel free to repeat the command again — it may eventually complete the conversion from second or third attempt.