Expanding your OpenSolaris NAS

After yesterday’s guide on setting up a Solaris NAS, I figure the next logical questions would be:

  • How do I change out disks which have failed?
  • How do I change out smaller disks for larger ones?
  • Can I add more disks to my pool?

All three questions are quite easily answered and, for the most part, the work can be done with a single tool.
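
That tool is zpool. As a quick preview, this whole post boils down to a handful of its subcommands (shown here in generic form; the pool and device names are placeholders):

# Check the health of a pool and its disks
zpool status <pool>

# Swap one disk for another (same command for a failed disk or a capacity upgrade)
zpool replace <pool> <old-device> <new-device>

# Show overall pool size and usage
zpool list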

First up: what happens when a disk fails? I’ve hot-removed one of the virtual hard disks from my array to simulate a failure and see what Solaris does.

root@opensolaris:/naspool/movies# zpool status naspool
 pool: naspool
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
 Sufficient replicas exist for the pool to continue functioning in a
 degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
 repaired.
 scrub: none requested
config:

 NAME        STATE     READ WRITE CKSUM
 naspool     DEGRADED     0     0     0
   raidz1    DEGRADED     0     0     0
     c8t1d0  ONLINE       0     0     0
     c8t2d0  ONLINE       0     0     0
     c8t3d0  ONLINE       0     0     0
     c8t4d0  FAULTED      3   953     0  too many errors

errors: No known data errors

Oh, dear. It looks like c8t4d0 has faulted and the pool is currently in a degraded state. But is our data still there?

root@opensolaris:/naspool/movies# ls -lah
total 29G
drwxr-xr-x 2 astro root    5 2009-10-10 10:01 .
drwxr-xr-x 5 astro root    5 2009-10-09 16:01 ..
-rw------T 1 root  root  10G 2009-10-10 09:55 10gigfile
-rw------T 1 root  root  10G 2009-10-10 09:58 10gigfile2
-rw------T 1 root  root 9.0G 2009-10-10 10:01 9gigfile

OK, my data is safe for the time being, but with one disk down I don’t have any room for error. I’ll have to replace that hard disk with a new one. The first thing to do is shut the system down so the new disk can be added. While I’m adding a disk in VMware’s hardware interface, imagine yourself crawling under your desk with a screwdriver and an anti-static strap.
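
For reference, the shutdown itself is just the standard Solaris command, run as root; -i5 takes the box all the way down to powered off:

# Shut down now (grace period 0), no confirmation prompt, init state 5 (power off)
shutdown -y -g0 -i5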

[Screenshot: adding a replacement virtual disk in VMware]

Righto, so let’s say I had another 10GB disk lying around and I’ve popped it into my server. Now all I need to do is tell Solaris that it’s there and that it should be used to replace the failed drive in my naspool array. So boot the system back up, log in, grab a command line, become root, and…

root@opensolaris:~# format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
0. c8t0d0
/pci@0,0/pci15ad,1976@10/sd@0,0
1. c8t1d0
/pci@0,0/pci15ad,1976@10/sd@1,0
2. c8t2d0
/pci@0,0/pci15ad,1976@10/sd@2,0
3. c8t3d0
/pci@0,0/pci15ad,1976@10/sd@3,0
4. c8t5d0
/pci@0,0/pci15ad,1976@10/sd@5,0
Specify disk (enter its number):
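
(An aside: because the machine was rebooted, the new disk was picked up automatically. If you hot-add a disk and it doesn’t appear in format, Solaris can rescan for new devices without another reboot:)

# Create the device nodes and /dev links for newly attached disks
devfsadm -c disk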

Ok, our new disk is c8t5d0, the next SCSI disk in the chain after the old failed disk. Let’s use zpool to replace c8t4d0 with c8t5d0.

root@opensolaris:~# zpool replace naspool c8t4d0 c8t5d0

Tough, huh? Ok, so how’s naspool looking?

root@opensolaris:~# zpool status naspool
 pool: naspool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
 continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h0m, 4.26% done, 0h3m to go
config:

 NAME           STATE     READ WRITE CKSUM
 naspool        DEGRADED     0     0     0
   raidz1       DEGRADED     0     0     0
     c8t1d0     ONLINE       0     0     0
     c8t2d0     ONLINE       0     0     0
     c8t3d0     ONLINE       0     0     0
     replacing  DEGRADED     0     0    45
       c8t4d0   FAULTED      0     0     0  too many errors
       c8t5d0   ONLINE       0     0     0  25.3M resilvered

errors: No known data errors

So naspool is still degraded, but it is ‘resilvering’ the information onto the new disk – copying across all of the data and parity so that we’ll be back to a fully redundant state. While that’s running, we can monitor it:

root@opensolaris:~# zpool status naspool
 pool: naspool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
 continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h1m, 79.09% done, 0h0m to go
config:

 NAME           STATE     READ WRITE CKSUM
 naspool        DEGRADED     0     0     0
   raidz1       DEGRADED     0     0     0
     c8t1d0     ONLINE       0     0     0
     c8t2d0     ONLINE       0     0     0
     c8t3d0     ONLINE       0     0     0
     replacing  DEGRADED     0     0    67
       c8t4d0   FAULTED      0     0     0  too many errors
       c8t5d0   ONLINE       0     0     0  4.71G resilvered

errors: No known data errors
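
If you don’t feel like re-running that by hand, a throwaway shell loop (my own sketch, not part of the original session) will poll it for you:

# Print the progress line every 30 seconds until the resilver finishes
while zpool status naspool | grep -q "resilver in progress"; do
    zpool status naspool | grep "scrub:"
    sleep 30
done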

And when it’s finished, we’ll see that the resilver is complete:

root@opensolaris:~# zpool status naspool
 pool: naspool
 state: ONLINE
 scrub: resilver completed after 0h2m with 0 errors on Sat Oct 10 10:28:01 2009
config:

 NAME        STATE     READ WRITE CKSUM
 naspool     ONLINE       0     0     0
   raidz1    ONLINE       0     0     0
     c8t1d0  ONLINE       0     0     0
     c8t2d0  ONLINE       0     0     0
     c8t3d0  ONLINE       0     0     0
     c8t5d0  ONLINE       0     0     0  6.73G resilvered

errors: No known data errors

As you can see, resilvering disks is nice and quick. My array is now fully redundant again and can suffer another disk failure without missing a beat. So that covers replacing a failed disk – what’s next?

The next step is to replace disks with larger ones to increase capacity on our array.

The method for doing this is pretty much identical to the way we replaced a failed disk – add a new disk, then tell Solaris to replace one with the other. For example, I’ll add in a new 50GB disk:

[Screenshot: adding a new 50GB virtual disk in VMware]

Then I’ll fire up Solaris, log in and check format to see what its new ID is.

root@opensolaris:~# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
 0. c8t0d0
 /pci@0,0/pci15ad,1976@10/sd@0,0
 1. c8t1d0
 /pci@0,0/pci15ad,1976@10/sd@1,0
 2. c8t2d0
 /pci@0,0/pci15ad,1976@10/sd@2,0
 3. c8t3d0
 /pci@0,0/pci15ad,1976@10/sd@3,0
 4. c8t5d0
 /pci@0,0/pci15ad,1976@10/sd@5,0
 5. c9t0d0
 /pci@0,0/pci15ad,790@11/pci15ad,1976@3/sd@0,0
Specify disk (enter its number): ^C

So there it is as c9t0d0. Let’s replace the first disk in the array with this new 50GB monster.

root@opensolaris:~# zpool replace naspool c8t1d0 c9t0d0
root@opensolaris:~# zpool status
 pool: naspool
 state: ONLINE
 scrub: resilver completed after 0h2m with 0 errors on Sun Oct 11 08:53:53 2009
config:

 NAME        STATE     READ WRITE CKSUM
 naspool     ONLINE       0     0     0
   raidz1    ONLINE       0     0     0
     c9t0d0  ONLINE       0     0     0  9.69G resilvered
     c8t2d0  ONLINE       0     0     0
     c8t3d0  ONLINE       0     0     0
     c8t5d0  ONLINE       0     0     0

errors: No known data errors

 pool: rpool
 state: ONLINE
 scrub: none requested
config:

 NAME        STATE     READ WRITE CKSUM
 rpool       ONLINE       0     0     0
   c8t0d0s0  ONLINE       0     0     0

errors: No known data errors

So now that our new disk is up and running, do a quick reboot and check the size of your zpool. Don’t expect it to have grown yet, though – with raidz, the extra space only appears once every disk in the vdev has been replaced.

root@opensolaris:~# zpool list
NAME      SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
naspool  39.8G  39.1G   636M    98%  ONLINE  -
rpool    7.94G  3.40G  4.53G    42%  ONLINE  -
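
(If you’d rather avoid the reboot, exporting and re-importing the pool is the usual way to make ZFS re-read the device sizes. An aside of mine rather than part of the original walkthrough – and with only one disk replaced so far there’s nothing to gain yet, but the same trick applies once all the disks have been swapped:)

# Re-read the sizes of the underlying devices without a reboot
# (export unmounts the pool's filesystems, import mounts them again)
zpool export naspool
zpool import naspool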

Awesome. So let’s do the same for the other disks in the array:

root@opensolaris:~# zpool list
NAME      SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
naspool  39.8G  39.1G   636M    98%  ONLINE  -
rpool    7.94G  3.40G  4.53G    42%  ONLINE  -
root@opensolaris:~# zpool status naspool
 pool: naspool
 state: ONLINE
 scrub: none requested
config:

 NAME        STATE     READ WRITE CKSUM
 naspool     ONLINE       0     0     0
   raidz1    ONLINE       0     0     0
     c9t0d0  ONLINE       0     0     0
     c8t2d0  ONLINE       0     0     0
     c8t3d0  ONLINE       0     0     0
     c8t5d0  ONLINE       0     0     0

errors: No known data errors
root@opensolaris:~# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
 0. c8t0d0
 /pci@0,0/pci15ad,1976@10/sd@0,0
 1. c8t1d0
 /pci@0,0/pci15ad,1976@10/sd@1,0
 2. c8t2d0
 /pci@0,0/pci15ad,1976@10/sd@2,0
 3. c8t3d0
 /pci@0,0/pci15ad,1976@10/sd@3,0
 4. c8t5d0
 /pci@0,0/pci15ad,1976@10/sd@5,0
 5. c9t0d0
 /pci@0,0/pci15ad,790@11/pci15ad,1976@3/sd@0,0
 6. c9t1d0
 /pci@0,0/pci15ad,790@11/pci15ad,1976@3/sd@1,0
 7. c9t2d0
 /pci@0,0/pci15ad,790@11/pci15ad,1976@3/sd@2,0
 8. c9t3d0
 /pci@0,0/pci15ad,790@11/pci15ad,1976@3/sd@3,0
Specify disk (enter its number): ^C
root@opensolaris:~# zpool replace naspool c8t2d0 c9t1d0
root@opensolaris:~# zpool status naspool
 pool: naspool
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
 continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h0m, 0.00% done, 0h0m to go
config:

 NAME           STATE     READ WRITE CKSUM
 naspool        ONLINE       0     0     0
   raidz1       ONLINE       0     0     0
     c9t0d0     ONLINE       0     0     0
     replacing  ONLINE       0     0     0
       c8t2d0   ONLINE       0     0     0
       c9t1d0   ONLINE       0     0     0  32.5K resilvered
     c8t3d0     ONLINE       0     0     0
     c8t5d0     ONLINE       0     0     0

errors: No known data errors

And go make a coffee until it’s finished resilvering, then replace the next disk:

root@opensolaris:~# zpool status naspool
 pool: naspool
 state: ONLINE
 scrub: resilver completed after 0h2m with 0 errors on Sun Oct 11 10:00:17 2009
config:

 NAME        STATE     READ WRITE CKSUM
 naspool     ONLINE       0     0     0
   raidz1    ONLINE       0     0     0
     c9t0d0  ONLINE       0     0     0
     c9t1d0  ONLINE       0     0     0  9.78G resilvered
     c8t3d0  ONLINE       0     0     0
     c8t5d0  ONLINE       0     0     0

errors: No known data errors
root@opensolaris:~# zpool replace naspool c8t3d0 c9t2d0

And so on, until you’ve changed them all out; then do a quick reboot to force the zpool to pick up its new size.
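
For the record, the whole replace-and-wait cycle could also be scripted rather than babysat. Here’s a rough sketch (the old/new pairs are the ones from this walkthrough, and c9t3d0 is assumed to be the fourth new disk):

# Replace each old disk in turn, waiting for its resilver to finish
# before starting the next replacement
for pair in "c8t2d0 c9t1d0" "c8t3d0 c9t2d0" "c8t5d0 c9t3d0"; do
    zpool replace naspool $pair   # $pair expands to "old new" (deliberately unquoted)
    while zpool status naspool | grep -q "resilver in progress"; do
        sleep 60
    done
done

Either way, after the final resilver and a reboot, the pool reports its new size: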

root@opensolaris:~# zpool list
NAME      SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
naspool   200G  39.1G   161G    19%  ONLINE  -
rpool    7.94G  3.43G  4.51G    43%  ONLINE  -

However, at the time of writing, there seems to be a little problem: when your zpool is resized, your ZFS filesystem isn’t. So even though my zpool is now showing 200G of space, my ZFS filesystem still appears to be its original size:

root@opensolaris:~# zfs list
NAME                       USED  AVAIL  REFER  MOUNTPOINT
naspool                   29.3G   118G  32.9K  /naspool
naspool/movies            29.3G   118G  29.3G  /naspool/movies
naspool/music             28.4K   118G  28.4K  /naspool/music
naspool/photos            28.4K   118G  28.4K  /naspool/photos

I’m currently researching a solution to this problem which doesn’t involve creating a new filesystem and moving all the files over from the old one.
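
For what it’s worth, newer ZFS releases added knobs for exactly this situation (I haven’t verified them on the OpenSolaris build used here, so treat these as pointers rather than a tested fix): the autoexpand pool property, and zpool online -e to expand a single already-replaced device.

# Let the pool grow automatically once every disk in a vdev is bigger
zpool set autoexpand=on naspool

# Or expand one replaced device by hand
zpool online -e naspool c9t0d0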

Update: It seems that once you’ve created a raidz filesystem, you cannot modify how large it is. But there are ways around it. Check this article for info.

5 thoughts on “Expanding your OpenSolaris NAS”

  1. Hello,

    Thanks a lot for sharing this with us – it’s really useful and saved me some time, since I want to do the same setup and testing you’ve done here.

    But could you please let me know about the problem in the last part, where zfs didn’t show the new size?

    I’m researching building my own NAS machine…the thing I’m not sure about is whether I could use a combination of SATA and IDE drives of different sizes?

    Thanks a lot 🙂

    1. Hi Hasan,
      I’ve just posted a new story about expanding an OpenSolaris NAS here. In a nutshell, you’re best off using a flat ZFS filesystem and then adding mirror disks for security, as raidz cannot be expanded once it has been created.

      Different sized IDE and SATA disks are fine, but getting mirroring happening could be problematic.

      Perhaps use raidz for the time being, but then aim to add a couple of big disks in a mirror down the track to replace the odd sized disks?

  2. Hi,
    I can’t see any problem at all. According to your screenshots you expanded the pool by replacing 10GB drives with 50GB drives and got a raw pool capacity of 200GB instead of the 40GB before. In your raidz1 configuration that gives (4-1) times the single-drive capacity for the zpool (150G). And your “zfs list” shows 118GB of available space and 29.3GB used, so everything is fine: the pool size expanded and the ZFS filesystem expanded as well.
    For testing I put one 2GB and one 4GB drive in a raidz1 (senseless for production, but sufficient for testing), filled up the pool, replaced the 2GB drive with an 8GB drive, and after resilvering I had 4GB of usable space on the existing ZFS filesystem instead of the 2GB before, without a reboot (tested via Samba). So raidz1 expansion by replacing drives with bigger ones step by step works without any problem. This was tested under FreeBSD 8.2 with zpool version 15 and zfs version 4.
    ZFS filesystems use whatever they can get from the underlying zpool. Whether that pool is expanded by adding a second mirror, replacing mirror disks with bigger ones or replacing raidz disks with bigger ones is not relevant.

    Greets…

    1. Hi Andiz,

      Thanks for the info – it looks like they’ve updated ZFS (in FreeBSD at least, not sure about OpenSolaris). I’ll have to fire it up and have another look 🙂
