# Hard drives made easy
by Seth Kenlon

On most computer systems, Linux or otherwise, when you plug a USB thumb drive in, you're somehow alerted that the drive exists. If the drive is already partitioned and formatted to your liking, all you need your computer to do is list the drive somewhere in your file manager window or on your desktop. It's a simple requirement, and one that's generally fulfilled.

Sometimes, however, a drive isn't set up the way you want it to be set up. For those times, you need to know how to find and prepare a storage device connected to your machine.

## Block devices

A hard drive is generically referred to as a "block device" because hard drives read and write data in fixed-size blocks. This differentiates a hard drive from anything else you might plug into your computer, like a printer, gamepad, microphone, or camera. The easy way to list the block devices attached to your Linux system is the **lsblk** (list block devices) command:

    $ lsblk
    NAME                  MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
    sda                    8:0    0 238.5G  0 disk  
    ├─sda1                 8:1    0     1G  0 part  /boot
    └─sda2                 8:2    0 237.5G  0 part  
      └─luks-e2bb...e9f8 253:0    0 237.5G  0 crypt 
        ├─fedora-root    253:1    0    50G  0 lvm   /
        ├─fedora-swap    253:2    0   5.8G  0 lvm   [SWAP]
        └─fedora-home    253:3    0 181.7G  0 lvm   /home
    sdb                    8:16   1  14.6G  0 disk  
    └─sdb1                 8:17   1  14.6G  0 part

The device identifiers are listed in the left column. Each drive's identifier begins with `sd` and ends with a letter, starting with `a` for the first drive. Each partition of each drive gets a number assigned to it, starting with `1`. For example, the second partition of the first drive is `sda2`. If you're not sure what a partition is, that's OK; this article covers it later.
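
As a small illustration of that naming scheme, here is a sketch that splits a name such as `sda2` into its drive and partition number. The `split_name` function is made up for this example, and it only understands the `sd` scheme described above, not other schemes your system might use.

```shell
# Split an sd-style partition name into its drive and partition number.
# split_name is a hypothetical helper, not a standard command.
split_name() {
    name=$1
    drive=${name%%[0-9]*}     # strip from the first digit onward: sda2 -> sda
    part=${name#"$drive"}     # whatever remains is the partition number: 2
    echo "drive=/dev/$drive partition=${part:-none}"
}

split_name sda2    # prints: drive=/dev/sda partition=2
split_name sdb     # prints: drive=/dev/sdb partition=none
```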

The **lsblk** command is nondestructive. You can run it at any time without any fear of ruining data on a drive. It's exclusively a tool for probing.
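
Because **lsblk** only reads information, its output is also safe to filter. The sketch below picks out removable whole disks (the `RM` column set to `1`, as with `sdb` in the listing above). The `removable_disks` function is a hypothetical helper, and the two sample lines are trimmed from the listing above so you can try the filter with no drive attached.

```shell
# Print the device paths of removable whole disks.
# Expects lines in the form "NAME RM TYPE" on standard input,
# which is what `lsblk -dn -o NAME,RM,TYPE` prints.
# removable_disks is a hypothetical helper, not a standard command.
removable_disks() {
    awk '$2 == 1 && $3 == "disk" { print "/dev/" $1 }'
}

# Two lines taken from the listing above, reduced to NAME, RM, and TYPE.
printf 'sda 0 disk\nsdb 1 disk\n' | removable_disks    # prints: /dev/sdb
```

On a real system, you would feed it live data instead: `lsblk -dn -o NAME,RM,TYPE | removable_disks`.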


## dmesg

If in doubt, you can test device label assignments by looking at the tail end of the **dmesg** command's output. The command prints the kernel's message log, which includes events such as attaching and detaching a drive. For instance, if you want to make sure that a thumb drive is *really* /dev/sdc, plug the drive into your computer and then run this dmesg command:

    $ sudo dmesg | tail

The most recent drive listed is the one you just plugged in. If you unplug it and run that command again, you'll see that the device has been removed. If you plug it in again and then run the command, the device is added. In other words, you can monitor the kernel's awareness of your drive.
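
If the tail of the log is noisy, you can narrow it down to just the device names. This is a rough sketch that assumes the kernel writes drive names in square brackets, such as `[sdc]`, which is the usual form for `sd` devices. The `recent_sd_names` function is invented for this example, and the input line below is synthetic text, not a real log entry.

```shell
# Pull bracketed sd device names out of text on standard input,
# in order of appearance, keeping only the last five.
# recent_sd_names is a hypothetical helper, not a standard command.
recent_sd_names() {
    grep -oE '\[sd[a-z]+\]' | grep -oE 'sd[a-z]+' | tail -n 5
}

# Synthetic input, only to show what the filter keeps.
printf 'unrelated line\nsomething [sdx] something\n' | recent_sd_names    # prints: sdx
```

With a real log, the equivalent would be `sudo dmesg | recent_sd_names`.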


## File system

If all you needed was the device label, then your work is done. You have the device label and can use it to accomplish your goal.

If your goal is to create a usable drive, then you must give the drive a file system.

If you're wondering what a file system is, then it's probably easier to understand the concept by first learning what happens when you have no file system at all. If you have a spare drive that has *no important data on it whatsoever*, then you can follow along with this example. Otherwise, do **not** attempt this exercise, because it will DEFINITELY ERASE DATA, *by design*.

As it happens, it's possible to use a drive without a file system. Once you have correctly identified a drive and verified that there is nothing important on it, plug the drive into your computer but do not mount it. If it auto-mounts, then unmount it manually:

    $ su -
    # umount /dev/sdx{,1}

To safeguard you from disastrous copy-paste errors, these examples use the unlikely `sdx` label for the example drive.

Now that the drive is unmounted, try this:

    # echo 'hello world' > /dev/sdx

You have just written data to the block device without it even being mounted on your system, much less with a file system.

To retrieve the data you just wrote, you can view the raw data on the drive:

    # head -n 1 /dev/sdx
    hello world

That seemed to work pretty well, but imagine that the phrase 'hello world' is one file. If you want to write a new 'file' using this method, you must:

1. know there's already an existing 'file' on line 1
2. know that the existing 'file' takes up only 1 line
3. devise a way to append new data, or else rewrite line 1 whilst writing line 2

For example:

    # echo 'hello world
    > this is a second file' > /dev/sdx

To get the first file, nothing changes:

    # head -n 1 /dev/sdx
    hello world

But it's more complex to get the second file:

    # head -n 2 /dev/sdx | tail -n 1
    this is a second file

Obviously this method of writing and reading data is not practical, and so developers have created systems to keep track of what constitutes a file, where one file begins and ends, and so on.
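
If you don't have a spare drive, you can try the same bookkeeping problem on an ordinary file, which harms nothing. In this sketch, a temporary file stands in for the drive, and the `read_file` function is a hypothetical helper that fetches a "file" by its line number. Notice that you still have to remember which line holds which "file", which is exactly the job a file system does for you.

```shell
# A regular file stands in for the drive, so no real device is touched.
disk=$(mktemp)

# Write two one-line "files" to the stand-in "device".
printf 'hello world\nthis is a second file\n' > "$disk"

# Read "file" number N back by its line number.
# read_file is a hypothetical helper, not a standard command.
read_file() {
    sed -n "${1}p" "$disk"
}

read_file 1    # prints: hello world
read_file 2    # prints: this is a second file
```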

In practice, a file system usually lives inside a partition.

## Partition

A partition on a hard drive is a sort of boundary on the device telling each file system what space it can occupy. For instance, if you have a 4 GB thumb drive, you can have a partition on that device taking up the entire drive (4 GB), or two partitions that each take 2 GB (or 1 and 3, if you prefer), or three of some variation of sizes, and so on. The combinations are nearly endless.

Assuming your drive is 4 GB, you can create one big partition from a terminal with the GNU **parted** command. Start by creating a partition table:

    # parted /dev/sdx --align opt mklabel msdos

This command specifies the device path before the parted command to run (here, **mklabel**).

The **--align** option, set here to **opt** for optimal alignment, tells parted which alignment to aim for when it places the start and end of a partition. It matters for the **mkpart** command used in the next step rather than for **mklabel** itself.

The **mklabel** command creates a partition table (called a *disk label*) on the device. This example uses the **msdos** label because it's widely compatible and popular, although **gpt** is becoming more common.

The **mklabel** command takes only the label type. The desired start and end points of a partition are given to **mkpart**.

Next, create the actual partition. If your choice of start and end points is not optimal, parted warns you and lets you adjust:

    # parted /dev/sdx -a opt mkpart primary 0 4G
    Warning: The resulting partition is not properly aligned for best performance: 1s % 2048s != 0s
    Ignore/Cancel? C                                                          
    # parted /dev/sdx -a opt mkpart primary 2048s 4G
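
The warning in that listing is plain arithmetic: a start sector of `1` leaves a remainder when divided by 2048, while `2048` does not. Here is a minimal sketch of the same check. The `is_aligned` function is a hypothetical helper, and the 2048-sector boundary is taken from the warning shown here, so a different drive may report a different figure. With 512-byte sectors, 2048 sectors is 1 MiB.

```shell
# Succeed if a start sector falls on a 2048-sector boundary,
# the boundary named in the parted warning.
# is_aligned is a hypothetical helper, not a standard command.
is_aligned() {
    [ $(( $1 % 2048 )) -eq 0 ]
}

for start in 1 2048 4096; do
    if is_aligned "$start"; then
        echo "${start}s: aligned"
    else
        echo "${start}s: not aligned"
    fi
done
```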

If you run **lsblk** again (you may have to unplug the drive and plug it back in), you see that your drive now has one partition on it.
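
If you'd rather rehearse these steps before touching a real drive, parted can also work on an ordinary file. The following is a sketch under a few assumptions: the 64 MiB size is arbitrary, the temporary file stands in for `/dev/sdx`, and the block skips the parted steps if parted isn't installed or isn't in your path.

```shell
# Practise on a 64 MiB image file instead of a real drive.
img=$(mktemp)
truncate -s 64M "$img"

if command -v parted >/dev/null 2>&1; then
    # --script suppresses interactive prompts.
    # Starting at 1MiB keeps the partition on an aligned boundary.
    parted --script "$img" mklabel msdos mkpart primary 1MiB 100% &&
        parted --script "$img" unit s print ||
        echo "parted reported an error"
else
    echo "parted not found; skipping"
fi
```

The `unit s print` command lists the result in sectors, so you can compare the start sector with the alignment warning shown earlier.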


## Manually creating a file system

There are many file systems. Some are free and open source, while others are not. Some companies have historically declined to support open source file systems, so their users can't read from open file systems, while open source users can't read from closed ones without reverse engineering them.

This disconnect notwithstanding, there are lots of file systems you can use, and the one you choose depends on the drive's purpose. If you want a drive to be compatible across many systems, then your only choice, at the time of writing, is the exFAT file system. Microsoft has not submitted exFAT code to any open source kernel, so you may have to install exFAT support with your package manager, but support for exFAT is included in both Windows and Mac OS.

Once you have exFAT support installed, you can create an exFAT file system on your drive, in the partition you created:

    # mkfs.exfat -n myExFatDrive /dev/sdx1

Now your drive is readable and writable by closed systems, and by open source systems utilizing additional (and as-yet nonsanctioned by Microsoft) kernel modules.

A common file system native to Linux is [ext4](https://opensource.com/article/17/5/introduction-ext4-filesystem). It's arguably a troublesome file system for portable drives, since it retains user permissions, which are often different from one computer to another, but it's a reliable and flexible file system in general. As long as you're comfortable managing permissions, ext4 is a great, journaled file system for portable drives:

    # mkfs.ext4 -L myExt4Drive /dev/sdx1 
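
You can try the same command without a drive by pointing it at an image file. This sketch assumes the e2fsprogs tools (**mkfs.ext4** and **e2label**) are installed and in your path, and it skips the step otherwise. The temporary file and its 64 MiB size are arbitrary stand-ins for `/dev/sdx1`.

```shell
# A 64 MiB image file stands in for the partition.
img=$(mktemp)
truncate -s 64M "$img"

if command -v mkfs.ext4 >/dev/null 2>&1; then
    # -q: quiet; -F: proceed even though the target is not a block device
    mkfs.ext4 -q -F -L myExt4Drive "$img"
    # e2label prints the label that was just written.
    e2label "$img"
else
    echo "mkfs.ext4 not found; skipping"
fi
```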

Unplug your drive and then plug it back in. For ext4 portable drives, use sudo to create a directory and then grant permission to that directory to a user and group common across your systems. If you're not sure what user and group is best, you can instead modify read and write permissions, with sudo or as root, on the system having trouble with the drive.
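
Here is a minimal sketch of that permission setup. A temporary directory stands in for the mounted drive, and your own primary group stands in for the shared group, so the commands run without sudo. On a real drive, you would run them with sudo against a directory under the drive's mount point, and the directory name `shared` is just an example.

```shell
# A temporary directory stands in for the mounted drive here.
mnt=$(mktemp -d)

# On a real drive, these three commands would need sudo.
mkdir "$mnt/shared"
chgrp "$(id -gn)" "$mnt/shared"   # stand-in for a group common to your systems
chmod 775 "$mnt/shared"           # owner and group may write, others may only read

ls -ld "$mnt/shared"
```

Keep in mind that ext4 stores numeric user and group IDs rather than names, so a group only counts as "common" if it has the same ID on each system.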

## Desktop tools

It's great to know how to deal with drives with nothing but a Linux shell between you and the block device, but sometimes you just want to get a drive ready to use without so much insightful probing. There are excellent tools from both the GNOME and KDE developers to make your drive prep easy.

GNOME Disks and KDE Partition Manager are both graphical interfaces providing an all-in-one solution for everything this article has explained so far. Launch either of these applications for a list of attached devices (in the left column). You can create or resize partitions, and then create a file system.

![KDE Partition Manager](kdepartition.jpeg)

The GNOME version is, predictably, simpler than the KDE version, so I'll demo the more complex one here and you're sure to be able to figure out the GNOME one if that's what you have handy.

To repeat the same actions as this article has reviewed, launch the KDE Partition Manager and enter your root password.

From the left column, select the disk you want to format. If your drive isn't listed there, make sure it's plugged in and then select **Tools** > **Refresh devices** (F5 on your keyboard).

*Don't continue unless you're happy to destroy the existing partition table of the drive.*  With the drive selected, click the **New Partition Table** button in the top toolbar. You are prompted to select the label you want to give the partition table: either GPT or msdos. The former is more flexible and can handle larger drives, while the latter is, like many Microsoft technologies, the de facto standard by force of market share.

Now that you have a fresh partition table, right-click on your device in the right panel and select **New** to create a new partition. Follow the prompts to set the type and size of your partition. This action combines the partitioning step with creating a file system.

![Create a new partition](newpartition.jpeg)

To apply your changes to the drive, click the **Apply** button in the top left corner of the window.


## Hard drives, easy drives

Dealing with hard drives is easy on Linux, and it's even easier if you understand the language of hard drives. Since switching to Linux, I've been better equipped to prepare drives in whatever way I want them to work for me, and also to recover lost data, just because of the transparency Linux provides when dealing with storage.

Here are a final few tips, should you choose to learn more about hard drives through experimentation:

1. Back up your data, and not just the data on the drive you're experimenting with. All it takes is one wrong move to destroy the partition of an important drive (which is a great way to learn about recreating lost partitions, but not much fun).
2. Verify and then re-verify that the drive you are targeting is the correct drive. I use **lsblk** frequently to make sure that I haven't moved drives around on myself (it's easy to remove two drives from two separate USB ports, and then mindlessly reattach them in a different order, causing them to get new drive labels).
3. Take the time to "destroy" a test drive and see if you can recover the data. It's a good learning experience to recreate a partition table, or to try to get data back after a file system has been removed.
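
The second tip can be partly automated. The sketch below refuses a target unless it is a block device that doesn't appear in `/proc/mounts`. The `safe_target` function is a hypothetical helper, and it is deliberately incomplete: it can't see file systems mounted through an intermediate layer such as the encrypted LVM volumes in the **lsblk** listing earlier, so treat it as a second opinion and not as a guarantee.

```shell
# Return 0 only if $1 is a block device that is not currently mounted.
# Reads /proc/mounts, so this sketch assumes a Linux system.
# safe_target is a hypothetical helper, not a standard command.
safe_target() {
    dev=$1
    if [ ! -b "$dev" ]; then
        echo "$dev is not a block device" >&2
        return 1
    fi
    # Match the device itself or any of its partitions (sdx, sdx1, ...).
    if grep -q "^$dev" /proc/mounts; then
        echo "$dev (or one of its partitions) is mounted" >&2
        return 1
    fi
    return 0
}

# Example: only report success when the check passes.
if safe_target /dev/sdx; then
    echo "/dev/sdx looks safe to work on"
fi
```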

For extra fun, if you have a closed OS lying around, try getting an open source filesystem working on it. There are a few projects working toward this kind of compatibility, and trying to get them working in a stable and reliable way is a good weekend project.

Have fun!