Home NAS Expansion – Continued

In previous articles, I looked at using ZFS to build a home NAS, an approach which turned out to be simple and extremely effective.

In a follow-up article, I discussed expanding that NAS by changing out disks one by one. While it’s easy enough to change the disks out, it doesn’t quite give you a bigger zpool: because the pool was created as raidz1, it will not increase in size.

That means that when you want to increase the size of your NAS down the track, you either have to copy everything off to an external device and then destroy and recreate the pool on the new disks, or consider a different way of building the NAS in the first place.
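
If you did end up going the copy-everything-off route, ZFS snapshots and send/receive make it a little less painful than a straight file copy. A rough, untested sketch, assuming you have a second pool called backuppool (the name is just an example) large enough to hold the data:

root@solaris1:~# zfs snapshot -r naspool@migrate
root@solaris1:~# zfs send -R naspool@migrate | zfs receive -F backuppool/naspool

After destroying and recreating naspool on the new disks, the same send/receive trick in the other direction brings everything back.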

So today I’m going to look at building a NAS with the same functionality as before, but one whose disks can be swapped out down the track to easily increase the size of your home storage system.

The first thing I’ll need is a clean install of OpenSolaris. As before, I’ve done my install in VMware so that adding hardware is quick and easy.

Now let’s say that I’m setting this system up for the first time. I’ve gone out and bought four 1Tb disks which I intend to connect to my NAS and use to serve out data to my house. I’ll create four 10Gb disks in VMware to represent them.

Note that I’ve added all the disks to virtual SCSI controller 1, and that the system hard disk is on IDE. When we boot back into Solaris and check the format tool, we’ll see that these locations are reflected in the identifiers of each of the disks:

root@solaris1:~# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
 0. c7d0 <DEFAULT cyl 4092 alt 2 hd 128 sec 32>
 /pci@0,0/pci-ide@7,1/ide@0/cmdk@0,0
 1. c10t0d0 <VMware,-VMware Virtual S-1.0-10.00GB>
 /pci@0,0/pci1000,30@11/sd@0,0
 2. c10t1d0 <VMware,-VMware Virtual S-1.0-10.00GB>
 /pci@0,0/pci1000,30@11/sd@1,0
 3. c10t2d0 <VMware,-VMware Virtual S-1.0-10.00GB>
 /pci@0,0/pci1000,30@11/sd@2,0
 4. c10t3d0 <VMware,-VMware Virtual S-1.0-10.00GB>
 /pci@0,0/pci1000,30@11/sd@3,0
Specify disk (enter its number): ^C

So the c7 at the start of the 8Gb system disk refers to the IDE controller, c10 is the designation for the SCSI controller the 10Gb disks are running on, and the t number after that is essentially the disk (target) number.

Now we’ll go ahead and create our NAS. But instead of a raidz array as we’ve created before, I’m going to start with just two disks, like so:

root@solaris1:~# zpool create naspool c10t0d0 c10t1d0
root@solaris1:~# zpool status naspool
 pool: naspool
 state: ONLINE
 scrub: none requested
config:

 NAME        STATE     READ WRITE CKSUM
 naspool     ONLINE       0     0     0
   c10t0d0   ONLINE       0     0     0
   c10t1d0   ONLINE       0     0     0

errors: No known data errors

I now have a 20Gb array of two disks at my disposal. But while I may have some storage space, there’s absolutely nothing to stop one of these disks dying and destroying my data. So I’ll attach the other two disks, one by one, to mirror the two that are already in the pool.

root@solaris1:~# zpool attach naspool c10t0d0 c10t2d0
root@solaris1:~# zpool attach naspool c10t1d0 c10t3d0
root@solaris1:~# zpool status naspool
 pool: naspool
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Sat Feb  6 11:26:06 2010
config:

 NAME         STATE     READ WRITE CKSUM
 naspool      ONLINE       0     0     0
   mirror     ONLINE       0     0     0
     c10t0d0  ONLINE       0     0     0
     c10t2d0  ONLINE       0     0     0
   mirror     ONLINE       0     0     0
     c10t1d0  ONLINE       0     0     0
     c10t3d0  ONLINE       0     0     0  42.5K resilvered

errors: No known data errors

Now we have two 10Gb mirrors which ZFS stripes together into one 20Gb pool. I’ll quickly create some filesystems on the array:

root@solaris1:~# zfs create -o casesensitivity=mixed naspool/music
root@solaris1:~# zfs create -o casesensitivity=mixed naspool/photos
root@solaris1:~# zfs create -o casesensitivity=mixed naspool/movies
root@solaris1:~# zfs list
NAME                      USED  AVAIL  REFER  MOUNTPOINT
naspool                   168K  19.6G    22K  /naspool
naspool/movies             19K  19.6G    19K  /naspool/movies
naspool/music              19K  19.6G    19K  /naspool/music
naspool/photos             19K  19.6G    19K  /naspool/photos
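
The casesensitivity=mixed property is mainly useful when the filesystems will be shared out to Windows machines over CIFS. If you want to actually serve these filesystems to the rest of the house, something along these lines should do it – this assumes the OpenSolaris CIFS server packages are installed, which may not be the case on a bare install:

root@solaris1:~# svcadm enable -r smb/server
root@solaris1:~# zfs set sharesmb=on naspool/music
root@solaris1:~# zfs set sharenfs=on naspool/movies

The same properties can of course be set on the other filesystems as needed.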

So now we have our NAS and it’s functioning properly. It is protected in the sense that if a disk fails we can simply replace it. So what happens if we fill up our NAS, and decide that we want to make it larger? Let’s simulate filling the array:

root@solaris1:~# cd /naspool/movies/
root@solaris1:/naspool/movies# mkfile 19g 19Gb_File
root@solaris1:/naspool/movies# zfs list
NAME                      USED  AVAIL  REFER  MOUNTPOINT
naspool                  19.0G   574M    23K  /naspool
naspool/movies           19.0G   574M  19.0G  /naspool/movies
naspool/music              19K   574M    19K  /naspool/music
naspool/photos             19K   574M    19K  /naspool/photos

Oh dear. We’re quickly running out of space, and it’s obvious that we’ll need to go out and grab ourselves some new disks to increase the storage space on our server.

So let’s say that I head down to my local computer hardware store and buy two 2Tb disks. The first thing I need to do is to remove the mirroring from one of my sets of disks.
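
One word of caution before breaking a mirror: while a disk is detached, that half of the pool has no redundancy at all, so it’s worth running a scrub first to confirm the surviving disks are healthy and error-free:

root@solaris1:~# zpool scrub naspool
root@solaris1:~# zpool status naspool

Once the scrub comes back clean, breaking the mirror is a single command: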

root@solaris1:~# zpool detach naspool c10t3d0
root@solaris1:~# zpool status naspool
 pool: naspool
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Sat Feb  6 11:26:06 2010
config:

 NAME         STATE     READ WRITE CKSUM
 naspool      ONLINE       0     0     0
   mirror     ONLINE       0     0     0
     c10t0d0  ONLINE       0     0     0
     c10t2d0  ONLINE       0     0     0
   c10t1d0    ONLINE       0     0     0

errors: No known data errors

Note that while the first two disks are still mirrored, c10t1d0 is now running on its own. I’ll now shut down the machine and replace c10t3d0 (the device I just detached from the array) with a new, bigger disk. I could have added all the disks to VMware to start with, but I believe this better represents how things are in the real world – where we have a limited number of SATA connections on our motherboards 😉

So I remove the old disk in VMware, and when I create the new one, I specify the same SCSI location as the one I just removed.

It will therefore appear in the same place as the disk I just removed, but will now be 20Gb instead of 10Gb. This simulates replacing a 1Tb disk with a 2Tb one.

When I boot back into Solaris, it will appear in format with the same identifier as before, but will now be twice the size:

root@solaris1:~# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
 0. c7d0 <DEFAULT cyl 4092 alt 2 hd 128 sec 32>
 /pci@0,0/pci-ide@7,1/ide@0/cmdk@0,0
 1. c10t0d0 <VMware,-VMware Virtual S-1.0-10.00GB>
 /pci@0,0/pci1000,30@11/sd@0,0
 2. c10t1d0 <VMware,-VMware Virtual S-1.0-10.00GB>
 /pci@0,0/pci1000,30@11/sd@1,0
 3. c10t2d0 <VMware,-VMware Virtual S-1.0-10.00GB>
 /pci@0,0/pci1000,30@11/sd@2,0
 4. c10t3d0 <DEFAULT cyl 2608 alt 2 hd 255 sec 63>
 /pci@0,0/pci1000,30@11/sd@3,0
Specify disk (enter its number): ^C

So now that we have our larger disk up and running, we’ll use it to replace c10t1d0. Doing so will cleanly increase the size of our NAS by 10Gb (1Tb in the real world).

Before the operation our zpool looks like this:

root@solaris1:~# zpool status naspool
 pool: naspool
 state: ONLINE
 scrub: none requested
config:

 NAME         STATE     READ WRITE CKSUM
 naspool      ONLINE       0     0     0
   mirror     ONLINE       0     0     0
     c10t0d0  ONLINE       0     0     0
     c10t2d0  ONLINE       0     0     0
   c10t1d0    ONLINE       0     0     0

errors: No known data errors

Then we replace the disk:

root@solaris1:~# zpool replace naspool c10t1d0 c10t3d0

And Solaris begins the replacement process.

root@solaris1:~# zpool status naspool
 pool: naspool
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
 continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h0m, 1.95% done, 0h2m to go
config:

 NAME         STATE     READ WRITE CKSUM
 naspool      ONLINE       0     0     0
   mirror     ONLINE       0     0     0
     c10t0d0  ONLINE       0     0     0
     c10t2d0  ONLINE       0     0     0
   replacing  ONLINE       0     0     0
     c10t1d0  ONLINE       0     0     0
     c10t3d0  ONLINE       0     0     0  189M resilvered

errors: No known data errors

After the resilvering completes, c10t1d0 will be removed from the pool automatically, and c10t3d0 will take its place.

root@solaris1:~# zpool status naspool
 pool: naspool
 state: ONLINE
 scrub: resilver completed after 0h2m with 0 errors on Sat Feb  6 12:07:00 2010
config:

 NAME         STATE     READ WRITE CKSUM
 naspool      ONLINE       0     0     0
   mirror     ONLINE       0     0     0
     c10t0d0  ONLINE       0     0     0
     c10t2d0  ONLINE       0     0     0
   c10t3d0    ONLINE       0     0     0  9.50G resilvered

errors: No known data errors
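
One caveat worth mentioning: whether the extra capacity shows up by itself depends on your OpenSolaris build. Builds that include the autoexpand pool property default it to off, in which case you would need to enable it (ideally before the replace) or explicitly expand the device afterwards. Both of the commands below are standard zpool commands, but whether you need them at all depends on your build:

root@solaris1:~# zpool set autoexpand=on naspool
root@solaris1:~# zpool online -e naspool c10t3d0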

Now that that’s done, all we need to do is shut down the machine and change out the 10Gb disk we just replaced for the second 20Gb disk we created earlier. As it was c10t1d0, we know it’s on what VMware considers SCSI 1:1.

Out comes the old 10Gb disk at SCSI 1:1, and in goes the new 20Gb disk in its place.

When we boot up again, the new disk will be ready to add to our zpool to act as a mirror for c10t3d0.

root@solaris1:~# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
 0. c7d0 <DEFAULT cyl 4092 alt 2 hd 128 sec 32>
 /pci@0,0/pci-ide@7,1/ide@0/cmdk@0,0
 1. c10t0d0 <VMware,-VMware Virtual S-1.0-10.00GB>
 /pci@0,0/pci1000,30@11/sd@0,0
 2. c10t1d0 <DEFAULT cyl 2608 alt 2 hd 255 sec 63>
 /pci@0,0/pci1000,30@11/sd@1,0
 3. c10t2d0 <VMware,-VMware Virtual S-1.0-10.00GB>
 /pci@0,0/pci1000,30@11/sd@2,0
 4. c10t3d0 <VMware,-VMware Virtual S-1.0-20.00GB>
 /pci@0,0/pci1000,30@11/sd@3,0
Specify disk (enter its number): ^C

And we’ll attach it as the new mirror for c10t3d0.

root@solaris1:~# zpool attach naspool c10t3d0 c10t1d0
root@solaris1:~# zpool status naspool
 pool: naspool
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
 continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h0m, 1.75% done, 0h2m to go
config:

 NAME         STATE     READ WRITE CKSUM
 naspool      ONLINE       0     0     0
   mirror     ONLINE       0     0     0
     c10t0d0  ONLINE       0     0     0
     c10t2d0  ONLINE       0     0     0
   mirror     ONLINE       0     0     0
     c10t3d0  ONLINE       0     0     0
     c10t1d0  ONLINE       0     0     0  170M resilvered

errors: No known data errors

When it finishes resilvering, we’ll be back to a fully redundant state, and will have an extra 10Gb to play with on our zpool:

root@solaris1:~# zfs list
NAME                      USED  AVAIL  REFER  MOUNTPOINT
naspool                  19.0G  10.4G    23K  /naspool
naspool/movies           19.0G  10.4G  19.0G  /naspool/movies
naspool/music              19K  10.4G    19K  /naspool/music
naspool/photos             19K  10.4G    19K  /naspool/photos

Of course, our data is still there – safe and sound in its slightly larger home.

root@solaris1:~# ls -lh /naspool/movies/
total 20G
-rw------T 1 root root 19G 2010-02-06 11:37 19Gb_File

If we wanted to, we could upgrade the first two disks in the same way – by removing the mirror, replacing the first disk, then adding a new mirror.
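
A rough sketch of that process, assuming the same device names as above and that the replacement disks appear at the same SCSI targets:

root@solaris1:~# zpool detach naspool c10t2d0
(shut down, swap the physical disk at SCSI 1:2 for a larger one, boot)
root@solaris1:~# zpool replace naspool c10t0d0 c10t2d0
(wait for the resilver to finish, shut down, swap the disk at SCSI 1:0, boot)
root@solaris1:~# zpool attach naspool c10t2d0 c10t0d0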

Conclusion.

So now that we’ve looked at the nuts and bolts of replacing disks in this sort of array, and seen that the zfs filesystems expand correctly with this method, where does that leave us?

While raidz may seem like a great idea – striping and parity with one disk of redundancy – in practice it is better to use a striped zpool of two or more disks with extra disks attached as mirrors. The reasons for this are:

  1. It performs much better than a raidz array, simply because the system doesn’t need to calculate and write parity every time you move data.
  2. It is much easier to upgrade the array or to replace a failed disk. All you need to do is break the mirroring (for a failed disk, detach it first), then use zpool replace or zpool attach to bring the new disk(s) into your array – see the sketch after this list.
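
For example, if c10t2d0 died outright, a hedged sketch of the recovery (assuming the replacement disk shows up at the same target once the hardware has been swapped) would be:

root@solaris1:~# zpool status -x
root@solaris1:~# zpool replace naspool c10t2d0

If the new disk appeared under a different name – say c10t4d0, which is purely hypothetical here – you would name it explicitly instead:

root@solaris1:~# zpool replace naspool c10t2d0 c10t4d0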

However, there is a drawback to this way of doing things. Because you need a mirror disk for each disk in your array, you will have less usable storage than with a raidz array. Four 1Tb disks would give roughly 2Tb of storage, as opposed to 3Tb with raidz.

But less storage is a small price to pay when it’s this easy to change out a failed disk or upgrade your array.


4 thoughts on “Home NAS Expansion – Continued”

  1. Hello,

    I’ve been reading your blog for a while and I like it. Thanks for sharing your expertise 🙂

    I have a question please. I’m new to the NAS thing and actually I’m studying the best way to build my NAS.
    I’ve seen FreeNAS, which I’m sure you know about. My question is: is it better to use FreeNAS or just install OpenSolaris? … would I be able to use different types and sizes of hard disks?

    I’m trying to do something close to the Drobo. How can I do that??

    My question might be too broad, excuse me for that because I’m new to the storage techniques.

    Thanks again 🙂

  2. Well, I’m not sure I’d call it ‘expertise,’ but thanks for the compliment 🙂

    Finding something which emulated the Drobo with the common PC hardware I had lying around was precisely what I was looking for when I started playing with OpenSolaris. I wanted the ability to mix disks of different speeds and sizes and store my data safely, as well as to replace failed disks or upgrade to larger ones on the fly.

    While you can simulate certain aspects of the Drobo’s feature set with OpenSolaris, there are a few limitations at this stage.

    For example, a raidz-1 array (single disk redundancy) cannot be expanded by adding larger disks. If you make the array 2Tb when you first create it, it shall always be 2Tb until it is destroyed. You can replace the disks with larger ones, and have a second filesystem if you wish, but the original filesystem will remain the same size.

    If you built a simple dynamic striped array, you could have the flexibility of disk replacement and dynamic expansion, but you would have to sacrifice data security in order to achieve it.

    As for FreeNAS, I’m not exactly sure what options it has which may bring it a little closer to Drobo’s feature set, but I do know that it offers a simplified version of the ZFS filesystem – with no desktop environment and a few other web-based administration features to boot.

    I plan on looking into FreeNAS soon, and I might write an article or two.

    Thanks,
    Leigh.

  3. You said:
    “If you built a simple dynamic striped array, you could have the flexibility of disk replacement and dynamic expansion, but you would have to sacrifice data security in order to achieve it.”

    Could you tell me how to build the array you are talking about? What security features do I lose?
    You mean I won’t have any redundancy?
