Configuring Active Directory authentication integration – Building an OpenIndiana based ZFS File Server – part 3

Getting Kerberos-based authentication working with Active Directory is actually pretty simple, and there are numerous blogs out there on the topic, so I’m probably mostly covering old ground on the basic integration stuff.

Our Active Directory already has schema extensions to hold Unix account data: initially Services for Unix (SFU), but we added the Server 2008 schema, which adds the RFC 2307 attributes we now use for Linux authentication.

First off, I should point out that we have a disjoint DNS namespace for our AD and normal client DNS, i.e. our AD is socs-ad.cs.bham.ac.uk, but our clients are all in cs.bham.ac.uk. This shouldn’t really cause any problems for most people. I’ve only come across three cases: back in about 2003, where a NetApp filer couldn’t work out the DNS name as it didn’t match the NetBIOS name (fixed a long time ago); about 18 months ago, when experimenting with SCCM and AMT provisioning, which doesn’t support disjoint DNS; and with the OI in-kernel CIFS server … in part 5!

To start with, we need to edit a couple of config files:

/etc/resolv.conf
  domain  cs.bham.ac.uk
  search cs.bham.ac.uk socs-ad.cs.bham.ac.uk
  nameserver  147.188.192.4
  nameserver  147.188.192.8
cp /etc/inet/ntp.client /etc/inet/ntp.conf
/etc/inet/ntp.conf
  server ad1.cs.bham.ac.uk
  server ad2.cs.bham.ac.uk
  server timehost.cs.bham.ac.uk

The first two are our DCs, the latter a general NTP server – all should be pretty much in step though!

Finally, enable ntp:

svcadm enable ntp
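
Once it’s up, you can sanity-check that it’s actually talking to the time servers (assuming the ntpq tool that ships with the ntp package is installed):

ntpq -p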

To help with Kerberos principal generation, I grabbed a copy of adjoin. Note that because we have a disjoint namespace, I had to hack it a little, otherwise it tries to add the full Windows domain to the hostname in the SPNs:

###fqdn=${nodename}.$dom
fqdn=bigstore.cs.bham.ac.uk
./adjoin -f

Check you have a correct-looking machine principal in the keytab file:

klist -e -k /etc/krb5/krb5.keytab
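
As a further sanity check that Kerberos itself is happy, you can try grabbing a ticket as a domain user (the username here is made up; the realm is just our AD domain in upper case):

kinit someuser@SOCS-AD.CS.BHAM.AC.UK
klist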

And enable a couple of services we’ll need:

svcadm enable /network/dns/client
svcadm enable /system/name-service-cache

We also need to configure pam.conf to use Kerberos, so you need to add a couple of lines similar to:

 other   auth     required     pam_unix_cred.so.1
 other   auth     sufficient   pam_krb5.so.1 debug
 other   auth     required     pam_unix_auth.so.1
 other   account  requisite    pam_roles.so.1
 other   account  required     pam_krb5.so.1 debug nowarn
 other   account  required     pam_unix_account.so.1
 other   password requisite    pam_authtok_check.so.1
 other   password sufficient   pam_krb5.so.1 debug
 other   password required     pam_authtok_store.so.1

The debug option is optional and helps with working out why things aren’t working. The nowarn on the account line is needed to stop a password expiry warning on each password-based login – our AD passwords are set to never expire, but without it, every login warns about expiry in 9244 days.

We now need to edit the nsswitch template we’ll use when configuring LDAP for passwd/group data; we want to remove all references to ldap except for passwd, group and automount:

/etc/nsswitch.ldap
  passwd:     files ldap
  group:      files ldap
  hosts:      files dns
  ipnodes:    files dns
  automount:  files ldap

Before we go any further, we also need to tweak the krb5 config:

/etc/krb5.conf
  [libdefaults]
    default_tkt_enctypes = rc4-hmac arcfour-hmac arcfour-hmac-md5
    default_tgs_enctypes = rc4-hmac arcfour-hmac arcfour-hmac-md5
    permitted_enctypes = rc4-hmac arcfour-hmac arcfour-hmac-md5

You might not need to do this; our DCs are quite old (running Server 2003), and without these settings, Kerberos authentication wouldn’t work.
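
For context, the rest of our krb5.conf (largely generated by adjoin) looks roughly like the sketch below – treat the realm and KDC entries as assumptions based on our domain names rather than a definitive config:

  [libdefaults]
    default_realm = SOCS-AD.CS.BHAM.AC.UK

  [realms]
    SOCS-AD.CS.BHAM.AC.UK = {
      kdc = ad1.socs-ad.cs.bham.ac.uk
      kdc = ad2.socs-ad.cs.bham.ac.uk
      admin_server = ad1.socs-ad.cs.bham.ac.uk
    }

  [domain_realm]
    .socs-ad.cs.bham.ac.uk = SOCS-AD.CS.BHAM.AC.UK
    .cs.bham.ac.uk = SOCS-AD.CS.BHAM.AC.UK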

We now need to configure LDAP:

 ldapclient -v manual \
 -a credentialLevel=self \
 -a authenticationMethod=sasl/gssapi \
 -a defaultSearchBase=dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk \
 -a defaultSearchScope=sub \
 -a domainName=socs-ad.cs.bham.ac.uk \
 -a defaultServerList="ad1.socs-ad.cs.bham.ac.uk ad2.socs-ad.cs.bham.ac.uk" \
 -a attributeMap=passwd:gecos=ad1.socs-ad.cs.bham.ac.uk \
 -a attributeMap=passwd:homedirectory=unixHomeDirectory \
 -a attributeMap=passwd:uid=sAMAccountName \
 -a attributeMap=group:uniqueMember=member \
 -a attributeMap=group:cn=sAMAccountName \
 -a objectClassMap=group:posixGroup=group \
 -a objectClassMap=passwd:posixAccount=user \
 -a objectClassMap=shadow:shadowAccount=user \
 -a serviceSearchDescriptor='passwd:ou=people,dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk?sub' \
 -a serviceSearchDescriptor='shadow:ou=people,dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk?sub?memberOf=CN=sysop,OU=Groups of People,OU=Groups,DC=socs-ad,DC=cs,DC=bham,DC=ac,DC=uk' \
 -a serviceSearchDescriptor='group:ou=groups,dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk?sub?(&(objectClass=group)(gidNumber=*))'

You’d need to tweak it for your environment of course. Importantly, you need the bit

(&(objectClass=group)(gidNumber=*))

for the group serviceSearchDescriptor, otherwise you’ll get spurious results if you have groups with no gidNumber assigned. Ideally we’d also have similar filters for passwd and shadow, but that didn’t seem to work properly.

Restart the nscd daemon so it picks up the new LDAP configuration:

svcadm restart name-service-cache

You could try doing an LDAP search with something like:

ldapsearch -R -T -h ad1.socs-ad.cs.bham.ac.uk -o authzid= -o mech=gssapi -b dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk -s sub cn=jaffle

and getent should now work as well:

getent passwd
getent group

Restricting login access

So, we’ve managed to integrate our password data on the server. We pretty much need access to all our directory users for NFS to work properly, so that usernames and UIDs match; however, this means anyone can log in to the server. There’s no equivalent of Linux’s pam_access, and there doesn’t appear to be any native way of specifying Unix groups of people who can log in to the system. The closest I found was pam_list, but this only works with netgroups, and as we don’t use those for anything any more, they were never migrated to our AD – and anyway, we’ve got perfectly good Unix groups of people to use on our other systems.

After running round in circles for a while, and almost creating netgroups, I came across a solution that seems to work nicely. It’s a bit of a hack, but actually it’s quite a nice solution for us. The key is in the ldapclient definition for shadow:

-a serviceSearchDescriptor='shadow:ou=people,dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk?sub?memberOf=CN=sysop,OU=Groups of People,OU=Groups,DC=socs-ad,DC=cs,DC=bham,DC=ac,DC=uk'

Note that we add an LDAP filter requiring membership of a specific LDAP group; this means that shadow data is only present for the members of that group and, hey presto, we’ve got Unix group based authorisation working on OpenIndiana. If you wanted multiple groups, you’d have to tweak the filter with some parentheses and a | – probably something like the sketch below.
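
For example (untested, and CN=storage-admins is a made-up second group):

 -a serviceSearchDescriptor='shadow:ou=people,dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk?sub?(|(memberOf=CN=sysop,OU=Groups of People,OU=Groups,DC=socs-ad,DC=cs,DC=bham,DC=ac,DC=uk)(memberOf=CN=storage-admins,OU=Groups of People,OU=Groups,DC=socs-ad,DC=cs,DC=bham,DC=ac,DC=uk))'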

And automount/autofs?

Autofs from LDAP was a little more complicated to get going, probably because we have Linux-format/named autofs maps in our AD. Technically we don’t need this bit working for our file-server, but for completeness we thought we’d investigate the options for it.

First off, we’ll edit a couple of files:

/etc/auto_home
  #+auto_home

/etc/auto_master
  #
  +auto_master
  /bham   +auto.linux
  #+auto.master

A long time ago, we used NIS for Solaris and Linux – back then the maps had to be kept discrete: Linux autofs didn’t have Solaris-style nested/multi-mount entries, and Solaris didn’t support the Linux way of nesting one map inside another. e.g. under Solaris we could do:

/bham
    ... /foo
    ... /baa
    ... /otherdir
            .... /foo
            .... /bar

But under Linux, it had to be:

/bham
    ... /foo
    ... /baa

and in another Linux map:

/bham
    ... /otherdir
            .... /foo
            .... /bar

When the NIS maps were migrated to LDAP for Linux as we rolled out Scientific Linux 6, this split got carried over as well. Now of course, this doesn’t work with Solaris, and neither does it work with OpenIndiana.

After a bit of consideration, we thought we were going to have to build a separate set of maps for OI again. But then we found that Linux autofs 5 now supports multi-mount entries, so we can use traditional Solaris-format maps for nested directories in a single map. A quick test and an early-morning edit to the maps, and we can now use the same maps under both OSes.

ldapclient mod \
  -a "serviceSearchDescriptor=auto_master:cn=auto.master,OU=automount,OU=Maps,dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk" \
  -a "serviceSearchDescriptor=auto.home:cn=auto.home-linux,OU=automount,OU=Maps,dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk" \
  -a "serviceSearchDescriptor=auto.linux:cn=auto.linux,OU=automount,OU=Maps,dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk" \
  -a "serviceSearchDescriptor=auto.home-linux:cn=auto.home-linux,OU=automount,OU=Maps,dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk" \
  -a objectclassMap=automount:automountMap=nisMap \
  -a objectclassMap=automount:automount=nisObject \
  -a objectclassMap=auto.home-linux:automount=nisObject \
  -a objectclassMap=auto.linux:automount=nisObject \
  -a attributeMap=automount:automountMapName=nisMapName \
  -a attributeMap=automount:automountKey=cn \
  -a attributeMap=automount:automountInformation=nisMapEntry \
  -a attributeMap=auto.home-linux:automountMapName=nisMapName \
  -a attributeMap=auto.home-linux:automountKey=cn \
  -a attributeMap=auto.home-linux:automountInformation=nisMapEntry \
  -a attributeMap=auto.linux:automountMapName=nisMapName \
  -a attributeMap=auto.linux:automountKey=cn \
  -a attributeMap=auto.linux:automountInformation=nisMapEntry
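
You can double-check what actually ended up in the client configuration with:

ldapclient list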

One caveat to note is that we had to map each of the top-level named maps. One might think that the lines:

  -a attributeMap=automount:automountMapName=nisMapName \
  -a attributeMap=automount:automountKey=cn \
  -a attributeMap=automount:automountInformation=nisMapEntry \

would inherit, but apparently not!

So all that’s left to do is restart autofs:

svcadm enable autofs
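
If maps don’t appear straight away, you can ask the automounter to re-read them and report what it’s doing:

automount -v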

Just as a side note on the format of maps we use for autofs, I’ve mentioned they are stored in our Active Directory. We’ve created a number of “nisMap” objects, for example, the object at “cn=auto.linux,OU=automount,OU=Maps,dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk” is a nisMap object (probably created using ADSI edit, but I think there’s a tab available if you install the right roles on the server).

The nisMap object then contains a number of nisObject objects. e.g.:

CN=bin,CN=auto.linux,OU=automount,OU=Maps,DC=socs-ad,DC=cs,DC=bham,DC=ac,DC=uk
nisMapName -> bin
nisMapEntry -> -rw,suid,hard,intr jaffle:/vol/vol1/bham.linux/bin

For the multi-mount map entry, each mount point is just space separated in the nisMapEntry, e.g.:

CN=htdocs,CN=auto.linux,OU=automount,OU=Maps,DC=socs-ad,DC=cs,DC=bham,DC=ac,DC=uk
nisMapName -> htdocs
nisMapEntry -> /events -rw,hard,intr jaffle:/vol/vol1/htdocs/web-events /hci -rw,hard,intr jaffle:/vol/vol1/htdocs/web-hci ...

(and yes, if you’re reading in the feed, the parts did get mixed up … I wrote part 4 before 3!)

part 1 – Hardware and Basic information
part 2 – Base Operating System Installation
part 3 – Configuring Active Directory authentication integration
part 4 – Configuring the storage pool and auto snapshots
part 5 – NFS Server config & In-kernel CIFS

Configuring the Storage Pools – Building an OpenIndiana based ZFS File Server – part 4

ZFS offers a number of different types of RAID configuration. raid-z1 is basically like traditional RAID 5 with a single parity disk, raid-z2 has two parity disks, and there’s also raid-z3 with three. Actually, “parity disk” isn’t strictly correct, as the parity is distributed across the disks in the set (with raid-z3 giving three copies of parity).

Using multiple parity sets is important, particularly when using high-capacity disks. It’s not necessarily just a case of losing multiple disks, but also taking account of what happens if you get block failures on a high-capacity disk. In fact, on our NetApp filer head, we’ve been using RAID-DP for some years now. We’ve actually had a double disk failure occur on a SATA shelf, but we didn’t lose any data, and the hot spare spun in quickly enough that we’d have had to lose another two disks to actually lose any data.

For this file-server we’ve got a 16-disk JBOD array, and we had a few discussions about how to configure it, and whether we should have hot spares. Looking around, people seem to always suggest having a global hot spare available; however, with raid-z3, we’ve decided to challenge that view.

We were taking for granted running at least raid-z2 with dual parity and a hot spare; however, with raid-z3, we don’t believe we need a hot spare. Using all the disks in raid-z3, we’re not burning any more disks than we would have done with raid-z2 plus a hot spare, and the extra parity set effectively acts as if we’d got a hot spare, except it’s already resilvered and spinning with the data available. So in that sense, we’d be able to lose 3 disks and still not have lost any data – some performance yes, but not actually any data until we’d lost a fourth disk. And that’s the same as raid-z2 plus a hot spare, except there we’d have a window of risk whilst the hot spare was resilvering …

Of course, you need to take a look at the value of your data and consider the options yourself! There’s an Oracle paper on RAID strategy for high capacity drives.

Configuring the Pool

All the examples of building a pool use the name tank for the pool name, so I’m going to stick with tradition and call ours tank as well. First off, we need to work out which disks we want to add to the pool; format is our friend here:

root@bigstore:~# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c2t10d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@a,0
       1. c2t11d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@b,0
       2. c2t12d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@c,0
       3. c2t13d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@d,0
       4. c2t14d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@e,0
       5. c2t15d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@f,0
       6. c2t16d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@10,0
       7. c2t17d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@11,0
       8. c2t18d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@12,0
       9. c2t19d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@13,0
      10. c2t20d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@14,0
      11. c2t21d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@15,0
      12. c2t22d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@16,0
      13. c2t23d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@17,0
      14. c2t24d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@18,0
      15. c2t25d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@19,0
      16. c3t0d0 <ATA-INTELSSDSA2BT04-0362 cyl 6228 alt 2 hd 224 sec 56>
          /pci@0,0/pci8086,3a40@1c/pci8086,3505@0/sd@0,0
      17. c3t1d0 <ATA-INTELSSDSA2BT04-0362 cyl 6228 alt 2 hd 224 sec 56>
          /pci@0,0/pci8086,3a40@1c/pci8086,3505@0/sd@1,0
      18. c3t2d0 <ATA-INTEL SSDSA2BT04-0362-37.27GB>
          /pci@0,0/pci8086,3a40@1c/pci8086,3505@0/sd@2,0
      19. c3t3d0 <ATA-INTEL SSDSA2BT04-0362-37.27GB>
          /pci@0,0/pci8086,3a40@1c/pci8086,3505@0/sd@3,0
      20. c3t4d0 <ATA-INTEL SSDSA2BW16-0362-149.05GB>
          /pci@0,0/pci8086,3a40@1c/pci8086,3505@0/sd@4,0
      21. c3t5d0 <ATA-INTEL SSDSA2BW16-0362-149.05GB>
          /pci@0,0/pci8086,3a40@1c/pci8086,3505@0/sd@5,0
Specify disk (enter its number):

Disks 0-15 are the SATA drives in the JBOD (note they are truncated to 2TB at present, though they are actually 3TB drives – we’re awaiting a new controller; as I mentioned in part 1, the LSI 1604e chip on the external SAS module truncates the disks. When the new controller arrives, we’ll have to re-create the pool, but 2TB is fine for our initial configuration and testing!).

Now let’s create the pool with all the disks using raidz3:

zpool create tank raidz3 c2t10d0 c2t11d0 c2t12d0 c2t13d0 c2t14d0 c2t15d0 c2t16d0 c2t17d0 c2t18d0 c2t19d0 c2t20d0 c2t21d0 c2t22d0 c2t23d0 c2t24d0 c2t25d0

To act as a read cache, we also have 2x 160GB Intel SSDs, so we need to add them to the pool as L2ARC cache devices:

zpool add tank cache c3t4d0 c3t5d0

You can’t mirror the L2ARC, though I did read some interesting thoughts on doing so. I don’t think it’s necessary in our case, however I could see how suddenly losing the cache in some environments might have a massive performance impact that could become mission critical.

We’ve also got a couple of SSDs in the system to act as the ZFS intent log (ZIL), so we’ll add them to the pool:

zpool add tank log mirror c3t2d0 c3t3d0

Note that they are mirrored. The ZIL is used to speed up writes by committing the data to SSD before it is written back to spinning disk. It acts as a last-chance copy should a write to spinning disk not have completed in the event of a system event (e.g. a power failure). With high-speed drives it’s perhaps not necessary, but under heavy load it will improve write performance. Its purpose is to allow a write to be acknowledged to a client even if it’s not yet been fully committed to spinning disk, and since it’s important that it stays consistent, we mirror it.

Ideally the drives should be SLC rather than MLC, but there’s a cost trade-off to be had there.
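
At this point it’s worth checking the pool looks as expected, with the raidz3 vdev, the mirrored log and the two cache devices:

zpool status tank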

De-duplication and compression

ZFS allows de-duplication and compression to be enabled at pool or dataset/file-system level. We’ll enable both at the pool level and the datasets will inherit from there:

zfs set dedup=on tank
zfs set compression=on tank
zfs get compression tank
zfs get compressratio tank

The latter command shows the compression ratio, which will vary over time depending on what files are in the tank.
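
Similarly, the pool-wide de-duplication ratio is available as a (read-only) pool property:

zpool get dedupratio tank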

When thinking about de-duplication, it’s important to consider how much storage you have and how much RAM is available. This is because you want the de-duplication hash tables to be in memory as much as possible for write speed. We’ve only got 12GB of RAM in our server, but from what I’ve read, the L2ARC SSDs should also be able to hold some of the tables and pick up some of the workload there.
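
If you want a feel for how big the de-duplication tables are likely to get before turning it on, zdb can simulate dedup against the existing data (this can take a long time on a well-populated pool):

zdb -S tank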

When setting up de-duplication, we had some discussions about hashing and how it might work in relation to hash clashes … Bonwick’s blog has a lot of answers on this topic!

Just to touch briefly on quotas when using compression and de-duplication. This isn’t something I’ve seen an answer on, and I don’t have time to look at the source code, but here’s my supposition: quotas do take account of compression, i.e. they account for the blocks used on disk rather than the logical size of the file, but they don’t take account of any de-duplicated data. I’m pretty sure this must be the case, otherwise the first block owner would be penalised, and the final block owner could suddenly go massively over quota if other copies were deleted.
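
For what it’s worth, quotas are just dataset properties; for example (the dataset name and size here are purely illustrative):

zfs set quota=500G tank/research
zfs get quota,used,compressratio tank/research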

And a quick note on compression … this is something which I was mildly bemused by on first look:

milo 31% du -sh /data/private/downloads/
4.1G	/data/private/downloads/
milo 32% date && cp -Rp /data/private/downloads/* . && date
Fri May 25 15:25:39 BST 2012
Fri May 25 15:28:29 BST 2012

milo 34% du -sh .
3.0G	.

So by the time the files were copied to the NFS-mounted, compressed ZFS file-system, they’d shrunk by 1.1G. Thinking about this, I should have realised that du shows the size of the blocks used rather than the file sizes, but at first glance I was a little bemused!

Automatic snapshots

We’d really like to have automatic snapshots available for our file-systems, and time-slider can provide this. The only real problem is that time-slider is a GUI application, and on a file-server this isn’t ideal. Anyway, I found it’s possible to configure auto-snapshots from the terminal. First off, install time-slider:

 pkg install time-slider
 svcadm restart dbus
 svcadm enable time-slider

Note, if you don’t restart dbus after installing time-slider, you’ll get python errors out of the time-slider service like:

dbus.exceptions.DBusException: org.freedesktop.DBus.Error.AccessDenied: Connection ":1.3" is not allowed to own the service "org.opensolaris.TimeSlider" due to security policies in the configuration file

To configure, you need to set the hidden ZFS property on the pool and then enable the auto-snapshots that you require:

zfs set com.sun:auto-snapshot=true tank
svcadm enable auto-snapshot:hourly
svcadm enable auto-snapshot:weekly
svcadm enable auto-snapshot:daily

There are blogs out there with more details on configuring this; in short, you need to export the auto-snapshot manifest, edit it and then re-import it.

However, you should also be able to manipulate the service using svccfg, for example:

svccfg -s auto-snapshot:daily setprop zfs/keep= astring: '7'

Would change the number of daily snapshots kept to 7. The interesting properties (depending on whether you’re changing monthly, weekly, daily or hourly) are:

svccfg -s auto-snapshot:daily listprop zfs
zfs           application
zfs/interval  astring  days
zfs/period    astring  1
zfs/keep      astring  6
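
Note that after changing a property with svccfg, you need to refresh (and probably restart) the instance before the running service picks it up:

svcadm refresh auto-snapshot:daily
svcadm restart auto-snapshot:daily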

To list your snapshots, use the following:

root@bigstore:~# zfs list -t snapshot
NAME                                                  USED  AVAIL  REFER  MOUNTPOINT
rpool/ROOT/openindiana-1@install                     17.7M      -  1.52G  -
rpool/ROOT/openindiana-1@2012-05-18-14:54:49          317M      -  1.82G  -
rpool/ROOT/openindiana-1@2012-05-24-10:58:07         65.6M      -  2.32G  -
tank@zfs-auto-snap_weekly-2012-05-24-12h14               0      -  90.6K  -
tank@zfs-auto-snap_daily-2012-05-27-12h14                0      -  90.6K  -
tank@zfs-auto-snap_hourly-2012-05-28-11h14               0      -  90.6K  -
tank/research@zfs-auto-snap_weekly-2012-05-24-12h14  55.0K      -  87.3K  -
tank/research@zfs-auto-snap_hourly-2012-05-24-13h14  55.0K      -  87.3K  -
tank/research@zfs-auto-snap_hourly-2012-05-25-09h14  64.7K      -   113K  -
tank/research@zfs-auto-snap_hourly-2012-05-25-11h14  55.0K      -   116K  -
tank/research@zfs-auto-snap_daily-2012-05-25-12h14   71.2K      -  25.2M  -
tank/research@zfs-auto-snap_hourly-2012-05-25-16h14   162K      -  3.00G  -
tank/research@zfs-auto-snap_daily-2012-05-26-12h14   58.2K      -  3.00G  -
tank/research@zfs-auto-snap_hourly-2012-05-27-04h14  58.2K      -  3.00G  -
tank/research@zfs-auto-snap_daily-2012-05-27-12h14       0      -  3.00G  -
tank/research@zfs-auto-snap_hourly-2012-05-28-11h14      0      -  3.00G  -

Just as a caveat on auto-snapshots, the above output is from a system that has been running weekly, daily and hourly snapshots for several days. Note that the hourly snapshots seem a bit sporadic in their availability. This is because if the intermediate (hourly or daily) snapshots are 0K in size, they get removed – i.e. snapshots are only kept where there’s actually been a change in the data. This seems quite sensible really…

Finally, we want to make the snapshots visible to NFS users:

zfs set snapdir=visible tank/research
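
Once visible, NFS users can browse the snapshots via the hidden .zfs directory at the root of the dataset (assuming the default mountpoint):

ls /tank/research/.zfs/snapshot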

For Windows users connecting via the in-kernel CIFS server, the previous versions tab should work.

part 1 – Hardware and Basic information
part 2 – Base Operating System Installation
part 3 – Configuring Active Directory authentication integration
part 4 – Configuring the storage pool and auto snapshots
part 5 – NFS Server config & In-kernel CIFS

Building an OpenIndiana based ZFS File Server – part 2

OpenIndiana Installation

Getting OpenIndiana onto our file-server hardware was a pretty simple affair: download the memory stick image, dd it onto a fresh memory stick and install. Actually, we did struggle with this to start with – we were planning on testing it on an old PC in my office (a 2007-generation 965-based Core 2 Duo PC – can’t believe these are machines coming out of desktop service!).

Basically, follow the text-based installer – it’s quite reminiscent of pre-Jumpstart installs! We didn’t add any additional users as we were planning on integrating with our Active Directory for authentication. If you do add one, the installer will disable root logins by default.

The installer creates a default ZFS rpool for the root file-system; as we want that to be a mirror, we needed to add a second device once it had booted. There are blog entries out there with instructions on this. As it’s been a while since I’ve done Solaris, I’d actually approached it in a different manner (using format and fdisk to manipulate the disk partitions by hand), but prtvtoc | fmthard is of course the way we used to do it with Solstice DiskSuite, and is quicker and easier.

If we hadn’t already installed Linux onto the disk to test initially, we probably wouldn’t have got the error

cannot label 'c3t1d0': EFI labeled devices are not supported
    on root pools

when trying to attach the disk directly by running

zpool attach rpool c3t0d0s0 c3t1d0s0

Clearing and reinitialising the partition table resolved that problem though. Don’t forget to use the installgrub command on the second disk to install the boot loader onto the disk, otherwise you won’t actually be able to boot off it in the event of a primary disk failure!

installgrub -m /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c3t1d0s0

Ensure that the disks have finished resilvering before you reboot the system! (zpool status will tell you!).

The resilver process in ZFS is different from what those familiar with mirrors in DiskSuite or Linux software RAID will expect: it doesn’t create an exact block-for-block copy of the disks, it only copies data blocks, so the data layout on the two disks is likely to be different. It also means that if you have a 100GB drive with only 8GB of data, you only need to mirror the 8GB of data, not all the empty file-system blocks as well …

Updating the OS

Since OpenIndiana 151a was released, there have been a bunch of updates made available, so you want to upgrade to the latest image. First off, update the pkg package:

pkg install package/pkg

You can then do a trial run image update:

pkg image-update -nv

Run it again without the “n” flag to actually do the update. This will create a new boot environment, which you can list with:

root@bigstore:~# beadm list
BE            Active Mountpoint Space Policy Created
openindiana   -      -          8.17M static 2012-05-18 15:11
openindiana-1 NR     /          3.95G static 2012-05-18 15:54
openindiana-2 -      -          98.0K static 2012-05-24 11:58

That’s it, you’re installed and all up to date!
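
Should an image-update ever go badly, the older boot environments in that listing are your way back – activate one and reboot (the name here is taken from the listing above):

beadm activate openindiana
init 6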

In addition to the base Operating System repositories, we also add some extra repos:

pkg set-publisher -p http://pkg.openindiana.org/sfe
pkg set-publisher -p http://pkg.openindiana.org/sfe-encumbered

Enabling Serial Console

All of our servers are connected to Cyclades serial console servers; this lets us connect to the serial ports via ssh from remote locations, which is great when bits of the system go down. To enable the serial console, you need to edit the grub config file. As we have a ZFS root pool, it’s located at /rpool/boot/grub/menu.lst. You need to comment out the splashimage line if present (as you can’t display the XPM down a serial port!), then add a couple more lines to enable serial access for grub.

 ###splashimage /boot/grub/splash.xpm.gz
 serial --unit=0 --speed=9600
 terminal --timeout=5 serial

You also need to find the kernel line and append some data to that:

-B console=ttya,ttya-mode="9600,8,n,1,-"

As there was already a -B flag in use, our resulting kernel line looked like this:

kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=ttya,ttya-mode="9600,8,n,1,-"

You might want to tweak the line-speed, parity etc for your environment.

Static IP address

For our servers, we prefer to use static IP addressing rather than have them use DHCP. As we selected auto-configuration during the install, we need to swap to a static IP address. Note that if you don’t have console access (serial or real console), you’ll likely get disconnected!

First off, we need to disable the DHCP client. The OpenIndiana DHCP client is part of the nwam service:

svcadm disable network/physical:nwam

Boom! Down goes the network connection. So make sure you have access via another method! I’ve been adminning Solaris for a long time (since 2001), so I’m going to configure networking in the traditional (pre-Solaris 10) manner … with some files, but you can do this step with the ipadm command (rough sketch at the end of this section).

/etc/hostname.ixgbe0
    bigstore
/etc/hosts
    147.188.203.45 bigstore bigstore.cs.bham.ac.uk bigstore.local
    ::1 bigstore bigstore.local localhost loghost
/etc/defaultrouter
    147.188.203.6
/etc/netmasks
    147.188.0.0     255.255.255.0
/etc/resolv.conf
    domain  cs.bham.ac.uk
    search cs.bham.ac.uk
    nameserver  147.188.192.4
    nameserver  147.188.192.8
cp /etc/nsswitch.dns /etc/nsswitch.conf

Finally we need to enable the static IP configuration service:

svcadm enable network/physical:default
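
For reference, the ipadm route I mentioned would look roughly like this – untested on this box, and assuming the ixgbe0 interface is already plumbed:

ipadm create-addr -T static -a 147.188.203.45/24 ixgbe0/v4
route -p add default 147.188.203.6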

part 1 – Hardware and Basic information
part 2 – Base Operating System Installation
part 3 – Configuring Active Directory authentication integration
part 4 – Configuring the storage pool and auto snapshots
part 5 – NFS Server config & In-kernel CIFS