Rawr!!!

Slow ZFS zpool?

October 6th, 2010 by

Is a slow ZFS pool giving you the blues? It sure was for me, and it took me forever to figure out the problem. I scoured the internet for solutions, many of which didn’t work, and it wasn’t until I used my brain that I figured out what was happening ;) Hopefully this post can reduce your frustrations and help you fix your slow ZFS pool easily.
I checked so many things, and in the end it was something very simple. But let’s start with all the things that I tried & that didn’t work. These are common problems with ZFS when you use cheap consumer hardware. I’m using Western Digital Green drives, and this is the price that I have to pay.

Wrong Direction:

  • TLER – First I thought that Time-Limited Error Recovery (TLER) was dramatically slowing the drives down. I had heard of this happening & went off to Google to find some solutions. The Wikipedia page (see Sources) explains that WDTLER.EXE can help, so I tried that from my warrior boot USB (I’ll post about this later), but it didn’t work for me, it just hung. So I started up another tool on my boot USB (PartedMagic) & went the trusty ol’ open source route with smartmontools. You need to download it from svn & compile it manually to get the new features. Then run the following:
smartctl -l scterc /dev/sda # Query the drive for TLER status
smartctl -l scterc,70,70 /dev/sda # This will set TLER to 7 seconds (default)
# You get this if your drive can't do TLER (bummer)
Warning: device does not support SCT Error Recovery Control command

# This is what you will get if your drive has TLER enabled (hooray)
SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)
  • Then I thought “You have 100% blocking”:

# iostat -xnz
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0   87.0    0.0 2878.1  0.0  0.0    0.0    0.4   0 100 c4t0d0
    0.0   83.0    0.0 2878.1  0.0  0.1    0.2    0.7   1  50 c4t1d0
    1.0    0.0   28.0    0.0  0.0  0.0    0.0    5.4   0   1 c4t2d0

See how disk c4t0d0 is at 100% in the %b column (the percentage of time the disk is busy)? :( Anywho, here is a fix for that:

su - # enter password
echo zfs_vdev_max_pending/W0t1 | mdb -kw
echo "set zfs:zfs_vdev_max_pending=1" >> /etc/system

This normally happens when ZFS queues up too many I/O requests per device and overloads your cheap SATA disks. Thankfully I didn’t have blocking, but I made the change anyway since I’m using WD Green drives.
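If you would rather not eyeball the table, here is a rough sketch that scans `iostat -xn`-style output and prints any device whose %b is at or above a threshold. The function name `busy_disks`, the threshold of 90 and the embedded sample (copied from the readout above) are my own choices for illustration, not part of any tool.

```sh
#!/bin/sh
# Print devices whose %b column (10th field of `iostat -xn` output)
# is at or above the given threshold. Reads iostat output on stdin.
busy_disks() {
    threshold="$1"
    awk -v t="$threshold" '
        # Data rows start with a number (the r/s value); header rows do not.
        $1 ~ /^[0-9.]+$/ && NF == 11 && $10 + 0 >= t + 0 { print $11 }
    '
}

# Sample taken from the readout above; on a live box, pipe in iostat -xn instead.
sample='                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0   87.0    0.0 2878.1  0.0  0.0    0.0    0.4   0 100 c4t0d0
    0.0   83.0    0.0 2878.1  0.0  0.1    0.2    0.7   1  50 c4t1d0
    1.0    0.0   28.0    0.0  0.0  0.0    0.0    5.4   0   1 c4t2d0'

printf '%s\n' "$sample" | busy_disks 90
```

With the sample above this prints only c4t0d0; lowering the threshold to 50 would also list c4t1d0.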

  • “Your drives are idling too much? (clicking sound)”

Fire up smartctl again & run the following:


smartctl -a /dev/rdisk1 | grep ID ; smartctl -a /dev/rdisk1 | grep Start

# This is what I got on my macbook pro - 320GB caviar black drive (it's my secondary)

[jgerold@jgerold-13mbp.oc.cox.net ~]$ smartctl -a /dev/rdisk1 | grep ID ; smartctl -a /dev/rdisk1 | grep Start
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
 4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       1770

Should I think about replacing this soon?

Not necessarily. Looking at the PDF from Western Digital, I have 600,000 spin up/down cycles to look forward to on this disk. I’m not sure whether RAW_VALUE equates to actual spin up/downs or whether smartctl is just being funky.

If you have a low value, then I wouldn’t worry; but if your count is high & ever growing, then I would recommend you boot into a Windows instance (use BartPE of some sort), grab a copy of WDIDLE.EXE (http://webdiary.com/i/?p=515) and turn off idling with wdidle /d or something like that. I’m not sure, I haven’t tried it.
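To put the count in perspective, here is a small sketch that pulls RAW_VALUE out of `smartctl -a`-style output and expresses it as a share of a rated cycle count. The helper names `smart_raw` and `percent_used` are made up for this post, and the 600,000 figure is simply the one quoted above. One caveat, as far as I know: drive datasheets usually state that rating for load/unload cycles, which smartctl reports as Load_Cycle_Count (attribute 193), so that may be the better attribute to feed in if your drive exposes it.

```sh
#!/bin/sh
# Print RAW_VALUE (10th column) of the named SMART attribute.
# Reads `smartctl -a` output on stdin.
smart_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Express a cycle count as a percentage of a rated lifetime.
percent_used() {
    awk -v used="$1" -v rated="$2" 'BEGIN { printf "%.1f%%\n", used / rated * 100 }'
}

# Attribute row copied from the readout above.
sample=' 4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       1770'

count=$(printf '%s\n' "$sample" | smart_raw Start_Stop_Count)
percent_used "$count" 600000
```

For the readout above this works out to roughly 0.3% of the rated cycles, which is why I wasn’t worried.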

Right Direction (This is what fixed it for me):

    I removed a horrible disk from my pool!
zpool offline ambry c0d4

I wish I had the readout from my ‘iostat -mX’, but my fourth drive had a 3285ms timeout and was slowing the whole pool down to a crawl because it was failing (but not dead). I was noticing full disk usage whenever I would do an ls on a small dir, or copy data, or anything that required disk use. The crazy thing is that it took about 3 days for the array to recover once I removed the bad disk from the pool. Me being ever curious, I would run the following to monitor how fast the drives were:

while true; do iostat -mX /dev/{ct01,ct02,ct03}; sleep 1; done

I joyfully watched my disks go from about .5 MB/s to now over 120MB/s (per disk)

Afterthoughts:

I wish I had run an iostat before trying all the things that I did :( But everything looks better in hindsight. I’m just glad that this is fixed and that I was able to make a secondary backup, just in case anything else goes wrong before I get my new server up.

Sources:
- http://en.wikipedia.org/wiki/Time-Limited_Error_Recovery
- http://letsgetdugg.com/2009/10/21/zfs-slow-performance-fix
- http://www.csc.liv.ac.uk/~greg/projects/erc
- http://webdiary.com/i/?p=515

Upgrade OpenSolaris to OpenIndiana

OpenIndiana was released a few days ago (I only found out today). Anywho, who wants to be on the most bleeding-edge up-to-dateness? Well I certainly do, so here’s how:


Last login: Wed Sep 15 23:15:07 2010 from jgerold-13mbp.o
Sun Microsystems Inc.   SunOS 5.11      snv_129 November 2008

# Add the new repository

pfexec pkg set-publisher --non-sticky opensolaris.org
pfexec pkg set-publisher -O http://pkg.openindiana.org/dev openindiana.org
pfexec pkg set-publisher -P openindiana.org

# List the current publishers

$ pkg publisher

PUBLISHER                             TYPE     STATUS   URI
openindiana.org          (preferred)  origin   online   http://pkg.openindiana.org/dev/
opensolaris.org          (non-sticky) origin   online   http://pkg.opensolaris.org/dev/

# Remove the old publisher
pfexec pkg unset-publisher opensolaris.org

# Checking to make sure everything went smoothly

$ pkg publisher
PUBLISHER                             TYPE     STATUS   URI
openindiana.org          (preferred)  origin   online   http://pkg.openindiana.org/dev/

# Run your update, and enjoy your new repository:

$ pfexec pkg image-update

Static IP OpenSolaris

Setting a static IP in OpenSolaris is similar to Linux, but yet again, differences arise…

vi /etc/resolv.conf (make sure DNS is correct [nameserver 10.0.0.1])
vi /etc/nwam/llp [change 'gani0 dhcp' to 'gani0 static 10.0.0.100' where gani0 is your interface, and 10.0.0.100 is the IP address you would like to assign]
vi /etc/defaultrouter (add your router '10.0.0.1')
# restart network
svcadm enable svc:/network/physical:default
svcadm restart svc:/network/physical:nwam
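If you would rather script the llp edit than open vi, something along these lines should do it. This is only a sketch: the function name `llp_static` is made up, it assumes the llp file holds one "interface mode" pair per line as in the example above, and it prints the edited copy to stdout instead of touching /etc/nwam/llp directly.

```sh
#!/bin/sh
# Rewrite the "<iface> dhcp" line of an NWAM llp file to a static address.
# Reads the file on stdin and writes the edited copy to stdout.
llp_static() {
    awk -v ifc="$1" -v ip="$2" '
        $1 == ifc && $2 == "dhcp" { print ifc, "static", ip; next }
        { print }   # every other line passes through untouched
    '
}

printf 'gani0 dhcp\n' | llp_static gani0 10.0.0.100
```

Redirect the output to a scratch file, look it over, and only then copy it into place.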

This is a quickie but a goodie…

Resources:

http://briancline.org/read/file_server_3_tweaking_opensolaris

Install VirtualBox in OpenSolaris

VirtualBox is awesome, and I think I might go with an all-inclusive OpenSolaris server in the future: one that runs ZFS and handles all my NFS storage needs, but also hosts virtual machines for the other servers that I would like to run.

Here is how to install VirtualBox on OpenSolaris:

Get the latest version from here: http://www.virtualbox.org/wiki/Downloads

wget http://download.virtualbox.org/virtualbox/3.0.12/VirtualBox-3.0.12-54655-SunOS.tar.gz
tar -xzf VirtualBox-3.0.12-54655-SunOS.tar.gz
pkgadd -d VirtualBoxKern-3.0.12-54655.pkg
pkgadd -d VirtualBox-3.0.12-54655.pkg

ZFS Dedup | OpenSolaris repos | beadm

November 22nd, 2009 by

DEDUP IS AMAZING. Can’t wait to try it out, but I’m waiting for my system to upgrade from the dev repository, err, I might just hold off cause it’s taking forever. Maybe I’ll play with it in a VM, instead of my Home NAS :*
Anywho please read up on it here: http://blogs.sun.com/bonwick/entry/zfs_dedup

It’s an amazing technology, and I’m super happy that it’s finally out.

——

Just in case you want to upgrade from 111b to 127, or just want the most up-to-date packages, try the following:

pkg set-publisher -O http://pkg.opensolaris.org/dev "opensolaris.dev"
pkg set-publisher -P opensolaris.dev

Now when you look at your repos, dev is default or ‘preferred’

fsk141@TrayNAS:/mnt$ pkg publisher
PUBLISHER                             TYPE     STATUS   URI
opensolaris.dev          (preferred)  origin   online   http://pkg.opensolaris.org/dev/
opensolaris.org                       origin   online   http://pkg.opensolaris.org/release/

——

As I work with OpenSolaris I am continually amazed and pissed off at the same time. Some things are insanely simple and straightforward; other things are tedious and hard because of my Linux background. Another thing that lets me down is the lack of a community like Arch Linux’s. There is a very large OpenSolaris community, but not in the sense of packages, and a lot is left to you (as in compiling manually).
——

One of the difficult concepts to grasp is the boot environment concept. It’s kinda like a snapshot of sorts for your boot environment. Since OpenSolaris isn’t a rolling distro like Arch Linux, there are different revisions. Well instead of forcing some difficult reinstall you can simply ‘pkg image-update’ and upon your next reboot you have your new environment. You also have the ability to easily revert.

fsk141@TrayNAS:/mnt$ beadm list
BE             Active Mountpoint Space   Policy Created
--             ------ ---------- -----   ------ -------
TrayNAS_Nov_21 -      -          304.15M static 2009-11-21 15:53
opensolaris    -      -          7.57M   static 2009-10-16 08:19
opensolaris-1  NR     /          3.62G   static 2009-11-22 01:44

fsk141@TrayNAS:/mnt$ beadm destroy TrayNAS_Nov_21
Are you sure you want to destroy TrayNAS_Nov_21? This action cannot be undone(y/[n]): y

beadm is a great tool to modify your BE, and I love the simplicity in the BE scheme.
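To make the Active column concrete: N marks the environment that is active right now, and R the one that will be active after the next reboot. Here is a small sketch that pulls those out of `beadm list`-style output; the function name `be_with_flag` is my own invention and the sample is just the listing from above.

```sh
#!/bin/sh
# Print the boot environment(s) whose Active column contains the given flag.
# N = active now, R = active on next reboot. Reads `beadm list` output on stdin.
be_with_flag() {
    flag="$1"
    # Skip the two header lines, then look for the flag in column 2.
    awk -v f="$flag" 'NR > 2 && index($2, f) > 0 { print $1 }'
}

sample='BE             Active Mountpoint Space   Policy Created
--             ------ ---------- -----   ------ -------
TrayNAS_Nov_21 -      -          304.15M static 2009-11-21 15:53
opensolaris    -      -          7.57M   static 2009-10-16 08:19
opensolaris-1  NR     /          3.62G   static 2009-11-22 01:44'

printf '%s\n' "$sample" | be_with_flag R
```

Handy before a `beadm destroy`, so you don’t try to remove the environment you are about to boot into.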

Speed up Open Solaris Boot (Fix Slow Boot)

November 22nd, 2009 by

It’s so annoying to wait 5+ minutes for Open Solaris to boot. Come to find out it’s because of the stupid graphical boot on startup. Damn gui’ness

1) Fix Grub menu.lst

  • Edit /rpool/boot/grub/menu.lst
  • Comment out the following lines in each boot entry:
splashimage
foreground
background
  • Remove ‘console=graphics’ from the ‘kernel’ line

You should end up with something like this:


splashimage /boot/grub/splash.xpm.gz
background 215ECA
timeout 30
default 2
#---------- ADDED BY BOOTADM - DO NOT EDIT ----------
title OpenSolaris 2009.06
findroot (pool_rpool,0,a)
bootfs rpool/ROOT/opensolaris
#splashimage /boot/solaris.xpm
#foreground d25f00
#background 115d93
#kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
module$ /platform/i86pc/$ISADIR/boot_archive
#---------------------END BOOTADM--------------------
title TrayNAS_Nov_21
findroot (pool_rpool,0,a)
bootfs rpool/ROOT/TrayNAS_Nov_21
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
module$ /platform/i86pc/$ISADIR/boot_archive
#============ End of LIBBE entry =============
title opensolaris-1
findroot (pool_rpool,0,a)
bootfs rpool/ROOT/opensolaris-1
#splashimage /boot/solaris.xpm
#foreground d25f00
#background 115d93
#kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
module$ /platform/i86pc/$ISADIR/boot_archive
#============ End of LIBBE entry =============
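If you have a lot of boot entries, the same edit can be sketched with awk. Treat this as an illustration, not a drop-in tool: the function name `strip_graphical_boot` is made up, the sample is a trimmed-down menu.lst, and unlike my hand edit above it changes the kernel line in place instead of keeping a commented copy. It leaves everything before the first title line (the global section) alone.

```sh
#!/bin/sh
# Comment out per-entry splash/colour lines and drop console=graphics.
# Lines before the first "title" (the global section) are left alone.
# Reads a menu.lst on stdin, writes the edited version to stdout.
strip_graphical_boot() {
    awk '
        /^title/ { in_entry = 1 }
        in_entry && /^(splashimage|foreground|background)[ \t]/ { print "#" $0; next }
        { sub(/,console=graphics/, ""); print }
    '
}

# Trimmed-down sample; single quotes keep $ISADIR and friends literal.
sample='splashimage /boot/grub/splash.xpm.gz
timeout 30
title OpenSolaris 2009.06
splashimage /boot/solaris.xpm
foreground d25f00
background 115d93
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics'

printf '%s\n' "$sample" | strip_graphical_boot
```

Write the result to a scratch file and diff it against the real menu.lst before copying anything over /rpool/boot/grub/menu.lst.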

ZFS Pool (Setup & Configure)

November 21st, 2009 by

ZFS is the best option for a NAS system. It makes for a super simple home NAS and does away with a lot of config hassles, because a lot of core services are built into ZFS, or rather into the ZFS stack. To set up any “extra” of ZFS, all you need is a one-line zfs command (zfs set sharenfs=on <pool> == instant NFS share).

1) Create the Pool

We need to fetch the disk IDs (the cXdY name after the disk number)

format </dev/null

So if we run that command I can pull out (c8d0 c8d1 c9d0 c9d1) as shown below:

root@TrayNAS:~# format </dev/null

Searching for disks...done

AVAILABLE DISK SELECTIONS:
0. c7d0 <DEFAULT cyl 3904 alt 2 hd 128 sec 32>
/pci@0,0/pci-ide@1f,1/ide@0/cmdk@0,0
1. c8d0 <DEFAULT cyl 60797 alt 2 hd 255 sec 252>
/pci@0,0/pci-ide@1f,2/ide@0/cmdk@0,0
2. c8d1 <DEFAULT cyl 60798 alt 2 hd 255 sec 189>
/pci@0,0/pci-ide@1f,2/ide@0/cmdk@1,0
3. c9d0 <DEFAULT cyl 60798 alt 2 hd 255 sec 189>
/pci@0,0/pci-ide@1f,2/ide@1/cmdk@0,0
4. c9d1 <DEFAULT cyl 60798 alt 2 hd 255 sec 189>
/pci@0,0/pci-ide@1f,2/ide@1/cmdk@1,0
Specify disk (enter its number):

Take the extracted disk IDs (c8d0 c8d1 c9d0 c9d1) and apply them to the next command. Edit accordingly depending on what pool type you would like (mirror, raidz1, raidz2).


zpool create -f ambry raidz1 c8d0 c8d1 c9d0 c9d1

I had to use -f because I have 3x 1.5TB drives, and one 2TB drive…

Now that we have the pool created, let’s marvel at what we accomplished:


root@TrayNAS:~# zpool list ambry
NAME    SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
ambry  5.44T   137K  5.44T     0%  ONLINE  -
root@TrayNAS:~# zfs list ambry
NAME    USED  AVAIL  REFER  MOUNTPOINT
ambry  95.8K  4.00T  28.4K  /ambry

The difference in size is just one of those little quirks: zpool list reports the raw capacity of all the disks (parity included), while zfs list reports the space you can actually use.
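Here is a back-of-the-envelope check of those two numbers. It assumes raidz1 sizes every member to the smallest disk (1.5TB here) and that drive vendors count in decimal terabytes while ZFS reports binary ones. The function name `raidz_capacity` and the nominal 1.5e12-byte disk size are my assumptions; real drives and ZFS overhead shift the results a little.

```sh
#!/bin/sh
# Estimate raw and usable raidz capacity in TiB.
# Args: number of disks, number of parity disks, smallest disk size in decimal TB.
raidz_capacity() {
    awk -v n="$1" -v p="$2" -v tb="$3" 'BEGIN {
        tib = tb * 1e12 / (1024 ^ 4)    # one disk, converted to binary TiB
        printf "raw %.2f TiB, usable %.2f TiB\n", n * tib, (n - p) * tib
    }'
}

raidz_capacity 4 1 1.5
```

That prints "raw 5.46 TiB, usable 4.09 TiB", which lines up with the 5.44T from zpool list and sits a bit above the 4.00T from zfs list; the remaining gap is presumably overhead and reserved space, which this sketch ignores.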

Another check just to make sure we’re all good, and using raidz1 (1 parity disk)

root@TrayNAS:~# zpool status ambry
  pool: ambry
 state: ONLINE
 scrub: none requested
config:

	NAME        STATE     READ WRITE CKSUM
	ambry       ONLINE       0     0     0
	  raidz1    ONLINE       0     0     0
	    c8d0    ONLINE       0     0     0
	    c8d1    ONLINE       0     0     0
	    c9d0    ONLINE       0     0     0
	    c9d1    ONLINE       0     0     0

errors: No known data errors

Everything looks good, let’s set up some filesystems, sharing, compression, and copy over our data…

For the rest of the write up, I’ll just output the commands I used to create my environment:

root@TrayNAS:~# zfs create ambry/Media
root@TrayNAS:~# zfs create ambry/Downloads
root@TrayNAS:~# zfs list | grep ambry
ambry                       176K  4.00T  31.4K  /ambry
ambry/Downloads            28.4K  4.00T  28.4K  /ambry/Downloads
ambry/Media                28.4K  4.00T  28.4K  /ambry/Media
root@TrayNAS:~# zfs get all ambry
NAME   PROPERTY              VALUE                  SOURCE
ambry  type                  filesystem             -
ambry  creation              Sat Nov 21 12:48 2009  -
ambry  used                  176K                   -
ambry  available             4.00T                  -
ambry  referenced            31.4K                  -
ambry  compressratio         1.00x                  -
ambry  mounted               yes                    -
ambry  quota                 none                   default
ambry  reservation           none                   default
ambry  recordsize            128K                   default
ambry  mountpoint            /ambry                 default
ambry  sharenfs              off                    default
ambry  checksum              on                     default
ambry  compression           off                    default
ambry  atime                 on                     default
ambry  devices               on                     default
ambry  exec                  on                     default
ambry  setuid                on                     default
ambry  readonly              off                    default
ambry  zoned                 off                    default
ambry  snapdir               hidden                 default
ambry  aclmode               groupmask              default
ambry  aclinherit            restricted             default
ambry  canmount              on                     default
ambry  shareiscsi            off                    default
ambry  xattr                 on                     default
ambry  copies                1                      default
ambry  version               3                      -
ambry  utf8only              off                    -
ambry  normalization         none                   -
ambry  casesensitivity       sensitive              -
ambry  vscan                 off                    default
ambry  nbmand                off                    default
ambry  sharesmb              off                    default
ambry  refquota              none                   default
ambry  refreservation        none                   default
ambry  primarycache          all                    default
ambry  secondarycache        all                    default
ambry  usedbysnapshots       0                      -
ambry  usedbydataset         31.4K                  -
ambry  usedbychildren        144K                   -
ambry  usedbyrefreservation  0                      -

I just created two filesystems to store my data, and then listed the zfs attributes available to me. I would like to take advantage of a few attributes, namely (compression, nfs, snapshots)

root@TrayNAS:~# zfs set sharenfs=rw,anon=0 ambry #allows root access from all hosts
root@TrayNAS:/# zfs set compression=on ambry/Media
root@TrayNAS:/# zfs set compression=on ambry/Downloads
# ^^^ I could set compression on for the whole ambry pool, but I want a little more fine-grained control. ^^^
mkdir -p /old_ambry/Media
mkdir /old_ambry/Downloads
mount 10.0.0.100:/ambry/Media /old_ambry/Media
mount 10.0.0.100:/ambry/Downloads /old_ambry/Downloads
# --- copied all my data over ---

I set compression on before moving the data, since it only applies to data written after it is enabled, not to what is already on disk. Also I could have used zfs cloning, yet I have no need for my previous snapshots, so it’s not necessary…

More to come soon, but at this point you should be able to have a fully functioning NAS, with nfs. Enjoy

NTP Open Solaris (Setup & Configure)

NTP is awesome, it’s been around forever, and it’s one of the first things I set up on any new machine. This is an especially important service for a NAS device, since accurate time is crucial, and there are many important services that are time-dependent (snapshots for example).

1) Set the Time

ntpdate 0.pool.ntp.org

— Output —

root@TrayNAS:/etc/inet# ntpdate 0.pool.ntp.org
21 Nov 12:21:19 ntpdate[1296]: adjust time server 66.96.96.29 offset -0.000521 sec

2) Configure

cp /etc/inet/ntp.server /etc/inet/ntp.conf
printf 'server 0.pool.ntp.org\nserver 1.pool.ntp.org\nserver 2.pool.ntp.org\n' >> /etc/inet/ntp.conf
svcadm enable ntp
svcs ntp

— Output —

root@TrayNAS:/etc/inet# svcs ntp
STATE          STIME    FMRI
online         12:12:57 svc:/network/ntp:default

Woot, time is set up. Enjoy synchronized time…

Opensolaris blast on the way.

I was planning to put up one giant post on how to set up a NAS with Open Solaris & ZFS. Yet I think it would be nice to have a few key posts, and then group them together in the end. I just installed Open Solaris on my new machine, and am starting the setup… More to come soon.