Intel AMT: Part 2 – Remote Configuration Service (RCS)

Remote Configuration Service (RCS)

Following on from the overview of Intel AMT, what follows is a transcript of the notes I took when configuring our RCS server…

RCS can be used in either database mode or non-database mode. Database mode gives access to a greater array of AMT functions, however it requires either:

  • Microsoft SQL Server 2008 Enterprise (x32/x64)
  • Microsoft SQL Server 2008 R2 Enterprise (x64)
  • Microsoft SQL Server 2005 Enterprise (x32)

Our provisioning server currently runs Microsoft SQL Server 2005 Standard, which RCS apparently will not work with, so, sadly, we have to use non-database mode for the time being.

Installing RCS

If you are upgrading a previous version of RCS:

  • Back up the existing configuration data – the included migration tool may assist, but in our case it couldn’t find any of the configuration data!
  • Back up the configuration scripts: D:\RCS_Scripts\ConfigAMT.bat and D:\RCS_Scripts\ConfigAMT.vbs

Follow RCS installation instructions!

You probably don’t want to be running RCS as the domain administrator:

  • Create a domain user (RCSuser) to run RCS as
  • Assign it Full Control of the Active Directory OU that RCS will use to hold AMT computer objects

Logs can be found in:

  • C:\ProgramData\Intel_Corporation\RCSConfServer
  • D:\RCS_Scripts (Location of ConfigAMT.vbs script)

The user running the RCS server requires the following permissions:

  • Issue and Manage Certificates
  • Request Certificates
  • Read and Enroll (for Enterprise CAs only)
  • In the CA: right-click the CA and select Properties
    • Policy Module->Properties
      • Ensure Follow settings in certificate template is selected
    • Security
      • Add RCSuser with Issue and Manage Certificates and Request Certificates

You now need to create the certificate for the server and the template for the AMT clients.

Client Certificates

Certificate Template

To create the client certificate template:

  • Run the Certificate Templates MMC snap-in
  • Select the User template in the right-hand pane
  • Duplicate the template – important: you must select Windows Server 2003 Enterprise
    • Enter template name
    • Enter validity period (10 years?)
    • Select publish certificate in Active Directory
    • Request Handling->CSPs
      • Select Microsoft Strong Cryptographic Provider
    • Subject Name tab
      • Select Supply in the request
    • Security tab
      • Add RCSuser with Read and Enroll permissions
    • Extensions tab
      • Application Policies: Server Authentication
  • Template properties->Issuance Requirements
    • Ensure CA certificate manager approval is not selected

Adding the template:

  • Run the Certificate Authority MMC snap-in
  • Certificate Templates->New->Certificate Template to issue
  • Select the created template
  • Restart the CA

Assigning the template

The certificate template is then used in the TLS Server Certificate Template settings in the AMT profile. The Default CNs setting can be used or changed as necessary.

Server Certificate

Create Certificate Template

To create the server certificate template:

  • In Certificate Template MMC snap-in
  • Duplicate the Computer Template
  • Select Windows Server 2003 Enterprise
    • Enter name
    • Enter Validity period: 10 years?
    • Extensions tab
      • Application Policies: Server Authentication and AMT Provisioning
    • Subject Name tab
      • Select Supply in the request
    • Request Handling tab
      • Select Allow private key to be exported

Adding the template:

  • Run the Certificate Authority MMC snap-in
  • Certificate Templates->New->Certificate Template to issue
  • Select the created template
  • Restart the CA

Create the Certificate

The Intel AMT documentation uses IE to create a certificate from the URL http://ca-srv/certsrv, but we don’t have this virtual directory in IIS on our CA, so we have to do it via the MMC:

  • Use the Certificates MMC snap-in and connect as Computer account:
  • In the Personal certificate store Request new certificate
  • Select AMT RCS Server template
  • Set Subject name to Common name => FQDN of server
  • Select Enroll
  • Export the certificate to .pfx including the private key

Validity Period

If, after creating the certificate, you only get a two-year validity period, remember that the validity period of an issued certificate is the lowest of:

1. The lifetime remaining on the issuing CA’s certificate.

2. The value in the certificate template (not applicable in our case).

3. The registry entries on the CA.

To view the registry settings:

certutil -getreg ca\validityperiod
certutil -getreg ca\validityperiodunits

To change the settings – must restart CA after the change:

certutil -setreg ca\validityperiodunits 10
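
For completeness, a sketch of the full change we would expect to need – setting both the validity period string and the unit count, then restarting the CA service (check the values against your own CA policy first):

certutil -setreg ca\validityperiod "Years"
certutil -setreg ca\validityperiodunits 10
net stop certsvc
net start certsvc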

Install the Certificate

Install the certificate in the RCSuser’s personal certificate store:

  • Run mmc as the RCSuser
  • Add the Certificates snap-in
  • Select Personal certificate store
  • Import the .pfx file

Restart the RCSServer

Configuring Client PKI

Setup.bin versions

You need to create the correct version of the Setup.bin file otherwise AMT will ignore it. It’s been hard to find a definitive list of which version of the file goes with which version of AMT, but from our machines it appears to be the following:

  • V1 – AMT prior to 3.0
  • V2 – AMT 3.0 or higher
  • V3 – AMT 6.0
  • V4 – AMT 7.0

To avoid loading certificate hashes manually into the Intel ME BIOS, a USB memory stick can be used. Create a setup.bin file with USBFile.exe, which is part of the Intel AMT SDK:

USBFile.exe -create setup.bin admin <new-password> -consume 0 -amt 
            -kvm 1 -oHash 1 -oHash 0 -hash cca-ca.pem CCA-CA -prov 1

Copy setup.bin to a USB memory stick and boot the AMT client from it.

Configuring Server

RCS needs to be configured to respond to Hello messages:

  • Configure Tools->Support configuration trigger by Hello messages
    • Enable
    • Point at ConfigAMT.bat
    • Add RCSuser permissions for read and write to folder and contents

Note: The RCSuser must have write permission to the folder containing the scripts

Restart the RCSServer

AMT Profiles

The profile name is hard coded into ConfigAMT.vbs so ensure you create the profile with the same name!

  • Optional Settings
    • Access Control List (ACL)
    • Transport Layer Security (TLS)
  • AD Integration
    • Active Directory OU: OU=Out of Band Management Controllers,….
  • Access Control List
    • Appropriate AD groups
  • Transport Layer Security
    • Certificate Authority: <your AD CCA>
    • Server Certificate Template: AMTTemplate
    • Common Names (CNs) in certificate: Default CNs
  • System Settings
    • Web UI, Serial Over LAN, IDE Redirection, KVM Redirection
    • KVM Settings: User consent required
    • System power states: Always On
    • ME BIOS Extension password: <AMT password>
    • Use following password for all systems: <AMT Password>
    • Select Enable Intel AMT to respond to ping requests
    • Disable Fast Call
    • IP and FQDN settings:
      • Use Primary DNS FQDN
      • Device and OS will have same FQDN
      • Get IP from DHCP server
      • Do not update DNS

Pre-shared Key Provisioning

Older versions of AMT (e.g. v2.0) don’t support Enterprise PKI mode and need to be provisioned via PSK instead.

Plan A:

Use ACUConfig.exe to create the pre-shared key:

ACUConfig.exe CreatePSK <RCS Address> /NewMEBxPass <new password> 
              /CurrentMEBxPass <current password> /UsingDHCP

Plan B:

(Note: ACUConfig.exe didn’t work….trying USBFile.exe next…)

USBFile.exe -create psk.bin <oldPassword> <newPassword> -v 1 -dhcp 1
            -ztc 0 -rpsk -consume 0 -psadd <RCS address> -pspo 9971

Import the keys into the RCS Server:

  • Start the SCS Console
  • Tools->Import PSK Keys from File…
  • Select above psk.bin file

Intel AMT

Background

Having managed to get Intel AMT up and running a year or so ago, we found we’d pretty much forgotten how to configure it in Enterprise mode. Since our new machines were coming with AMT 7, we thought we’d revisit the technology. Having found a lot of people struggling to get AMT up and running, we thought we’d share our findings in case they prove useful.

Overview of Intel AMT

Intel Active Management Technology (Intel AMT) is part of vPro and essentially allows out-of-band monitoring and management of hardware even when the machine is powered down. See the appropriate Intel AMT and Intel Setup & Configuration Software (SCS) documentation for more detailed information of the full capabilities.

Configuration Modes

There are several methods for configuring and operating Intel AMT:

  • Manual Mode – The configuration is carried out locally via the Intel ME section of the BIOS.
  • Host Based – The configuration is carried out locally on the machine under Windows by running the installed configuration tool.
  • Small Medium Business (SMB) Mode – The configuration is carried out via a USB memory stick containing setup.bin created via the configuration tool.
  • Enterprise Mode using PKI – This uses the Remote Configuration Service (RCS) component of the Intel Setup & Configuration Software (SCS).

Manual

This is the easiest way to get AMT up and running. Simply configure the AMT settings via Local Provisioning in the Intel ME section of the BIOS.

  • Configure hostname
  • Set network to DHCP/static
  • You may need to turn on Legacy SMB Support (See this article)

Host Based

This mode requires Windows: run the Intel SCS Configuration Tool and select Configure/Unconfigure this system.

Small Medium Business (SMB) Mode

Use the Intel SCS Configuration Tool to create a USB memory stick with setup.bin, then boot the machines off the memory stick.

Enterprise Mode using PKI

Enterprise mode allows automatic remote provisioning via the Remote Configuration Service (RCS) provisioning server, provided as part of the Intel SCS (Setup and Configuration Software). This mode is a little fiddly to get going, but the real reason to use it over the previous methods is that AMT traffic uses TLS rather than plain HTTP, so the encryption protects the authentication credentials and the AMT payload traffic.

This mode ideally requires write access to the Active Directory in order to create machine objects for the AMT host and access to a Microsoft Certificate Authority to create certificates for each AMT host.

The RCS server needs to be configured with a signed certificate, either from a commercial CA (a number of whose certificate hashes are in the Intel ME BIOS by default) or from a self-signed CA. Ours is generated from the CA on SMS-1; this allows the clients to verify the authenticity of the server.

If you opt to use self-signed certificates, the hash of the CA certificate needs to be loaded into the User Certificate section of the Intel ME BIOS. This can be done by hand but is time-consuming, especially if you need to do more than a couple of machines. The best way to quickly configure a large number of machines is to create a USB memory stick containing the appropriate certificate hash via USBFile.exe from the Intel AMT SDK.

A user-configured script can also be assigned to the server in order to determine the appropriate AMT configuration profile to assign to a host when it requests configuration. Ours essentially looks up the UUID in the Active Directory in order to determine the hostname. This requires machines to exist in the Active Directory beforehand, but since all our hosts are either pre-staged or created via WDS this isn’t a problem for us.

A basic overview of the process follows below:

  • The AMT client sends Hello packets to provisionserver.<domain>
  • Various parameters (UUID, IP address etc.) from the calling host are passed to the configuration script
  • The script looks up the host from these parameters and determines the profile to apply
  • The script performs a WMI call to ConfigAMT(….)
  • The RCS server creates an AMT computer object in a specified Active Directory OU
  • The Certificate Authority issues a signed certificate for the client
  • The AMT configuration profile is sent to the client

See Part 2 for details on configuring the Remote Configuration Service.

Using AMT

Via Web Browser

For manual or SMB mode:

http://<fqdn>:16992/

For Enterprise mode:

https://<fqdn>:16993/

Via Linux Tools

  • amttool
  • amtterm
  • gamt

You need to set the AMT_PASSWORD environment variable to the AMT admin password.

For Enterprise mode:

amttool <fqdn>:16993 <command>
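
For example, a minimal sketch – the hostname here is hypothetical, and info/powerup are the usual amttool subcommands (check the man page for your version):

export AMT_PASSWORD='<AMT password>'
amttool amthost.cs.bham.ac.uk:16993 info      # show AMT version and power state
amttool amthost.cs.bham.ac.uk:16993 powerup   # power the machine on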

Via RealVNC viewer

RealVNC Viewer Plus has support for accessing the AMT onboard KVM for remote diagnosis.

Proxying to GlassFish

We’re currently developing a new coursework submission system, which the developer is building in Java running on GlassFish. We generally don’t like to expose individual servers out to the world; instead we proxy URLs using Apache httpd’s mod_proxy, so, for example, https://www.cs.bham.ac.uk/internal/students/submission will be the URL used, but that’s actually running on some back-end GlassFish servers.

When testing uploads of files (generally large, but possibly over slow connections), we found that we were getting the following in the httpd error_log:

proxy: pass request body failed to

And the client would error with something like:

net::ERR_CONNECTION_RESET

Eventually, after much Googling and reading similar reports, I came upon the Apache httpd directive we needed to tweak:

RequestReadTimeout body=10,MinRate=1000

Basically this sets an initial timeout of 10 seconds for reading the request body, extended by 1 second for every 1000 bytes of data received – so slow but steady uploads are no longer cut off.
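
For context, here is a rough sketch of how this sits alongside the proxy rules – the back-end hostname, port and module paths are assumptions for illustration, not our actual config:

# mod_reqtimeout provides RequestReadTimeout; mod_proxy/mod_proxy_http do the forwarding
LoadModule reqtimeout_module modules/mod_reqtimeout.so
LoadModule proxy_module      modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so

# Allow slow uploads: 10s initial body timeout, extended by 1s per 1000 bytes received
RequestReadTimeout body=10,MinRate=1000

# Forward the submission URL to the back-end GlassFish instance
ProxyPass        /internal/students/submission http://glassfish-1.cs.bham.ac.uk:8080/submission
ProxyPassReverse /internal/students/submission http://glassfish-1.cs.bham.ac.uk:8080/submission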

Musings on a Raspberry Pi

Well, after ordering one back in April, it finally arrived on Saturday. After waiting 10 weeks I’d nearly forgotten why I’d wanted one in the first place! However, my enthusiasm to play with it hadn’t waned, and with much gusto I set about connecting the collection of accessories I’d amassed while waiting for it to arrive.

Having faced compatibility issues with random devices at work over the years, I decided to buy from the list of compatible peripherals to avoid any potential issues. Here’s my list:

  • Logitech MK260 Keyboard
  • Duronic 1000mA USB Adapter
  • Micro USB Cable
  • 8GB SDHC Card
  • HDMI Cable
  • Tenda Wireless-N150 USB Adapter
  • New Link 4 Port USB Powered Hub
  • Case from ModMyPi

I’d already written Debian (debian6-19-04-2012) to the SD card in anticipation of the R-Pi arriving, so I was all set to power the beast up.

Things to note:

  1. The USB hub will back-feed the R-Pi – this probably isn’t a good thing, so I power the board first and then connect the hub.
  2. The Tenda USB wireless adapter needs:
    • apt-get install firmware-ralink
  3. After starting X the display started to randomly blank. Starting Scratch caused it to blank totally! This may have been caused by the length (2m) of my HDMI cable. Adding the following to /boot/config.txt seemed to fix the problem:
    • config_hdmi_boost = 2
  4. By default sound is disabled as the drivers are still experimental:
    • apt-get install alsa-utils
    • modprobe snd_bcm2835
    • Add snd_bcm2835 to /etc/modules to load the module on boot
  5. I get random key repeats or missing characters sometimes. This may be due to
    1. Not enough power to the USB receiver, although power from the powered hub should be OK.
    2. Interference between wireless and keyboard adapter.

I need to do further experimentation with the keyboard issue; unfortunately I don’t have a wired USB keyboard to try, but comments by various people on forums seem to indicate that a wired keyboard will most likely work fine.

 

Configuring Active Directory authentication integration – Building an OpenIndiana based ZFS File Server – part 3

Getting Kerberos-based authentication working with Active Directory is actually pretty simple – there are numerous blog posts out there on the topic – so I’m probably mostly covering old ground on the basic integration stuff.

Our Active Directory already has schema extensions to hold Unix account data: initially Services for Unix (SFU), but we added the Server 2008 schema, which provides the RFC 2307 attributes we now use for Linux authentication.

First off, I should point out that we have a disjoint DNS namespace for our AD and normal client DNS, i.e. our AD is socs-ad.cs.bham.ac.uk, but our clients are all in cs.bham.ac.uk. This shouldn’t really cause any problems for most people (I’ve only come across three cases: 1, back in about 2003, where a NetApp filer couldn’t work out the DNS name as it didn’t match the NetBIOS name – fixed a long time ago; 2, about 18 months ago when experimenting with SCCM and AMT provisioning – it doesn’t support disjoint DNS; and 3, with the OI in-kernel CIFS server – see part 5!).

First off, we need to configure some config files:

/etc/resolv.conf
  domain  cs.bham.ac.uk
  search cs.bham.ac.uk socs-ad.cs.bham.ac.uk
  nameserver  147.188.192.4
  nameserver  147.188.192.8
cp /etc/inet/ntp.client /etc/inet/ntp.conf
/etc/inet/ntp.conf
  server ad1.cs.bham.ac.uk
  server ad2.cs.bham.ac.uk
  server timehost.cs.bham.ac.uk

The first two are our DCs, the latter a general NTP server – all should be pretty much in step though!

Finally, enable ntp:

svcadm enable ntp
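
Once it has had a few minutes to sync, it is worth a quick sanity check that the daemon is actually talking to the servers:

ntpq -p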

To help with Kerberos principal generation, I grabbed a copy of adjoin. Note that because we have a disjoint namespace, I had to hack it a little, otherwise it tries to add the full Windows domain to the hostname in the SPNs:

###fqdn=${nodename}.$dom
fqdn=bigstore.cs.bham.ac.uk
./adjoin -f

Check that you have a correct-looking machine principal in the keytab file:

klist -e -k /etc/krb5/krb5.keytab

And enable a couple of services we’ll need:

svcadm enable /network/dns/client
svcadm enable /system/name-service-cache

We also need to configure pam.conf to use Kerberos, so you need to add a couple of lines similar to:

 other   auth required           pam_unix_cred.so.1
 other auth sufficient pam_krb5.so.1 debug
 other   auth required           pam_unix_auth.so.1
 other   account requisite       pam_roles.so.1
 other account required pam_krb5.so.1 debug nowarn
 other   account required        pam_unix_account.so.1
 other   password requisite      pam_authtok_check.so.1
 other password sufficient pam_krb5.so.1 debug
 other   password required       pam_authtok_store.so.1

The debug option is there to help with working out why things aren’t working, and is optional. The nowarn on the middle account entry is needed to stop a password expiry warning on each password-based login – our AD passwords are set to never expire, but without this it warns about expiry in 9244 days.

We now need to edit a file so that we can configure LDAP for passwd/group data; we want to remove all references to ldap except for passwd, group and automount:

/etc/nsswitch.ldap
  passwd:     files ldap
  group:      files ldap
  hosts:      files dns
  ipnodes:    files dns
  automount:  files ldap

Before we go any further, we also need to tweak the krb5 config:

/etc/krb5.conf
  [libdefaults]
    default_tkt_enctypes = rc4-hmac arcfour-hmac arcfour-hmac-md5
    default_tgs_enctypes = rc4-hmac arcfour-hmac arcfour-hmac-md5
    permitted_enctypes = rc4-hmac arcfour-hmac arcfour-hmac-md5

You might not need to do this; our DCs are quite old, running Server 2003, and without these settings Kerberos authentication wouldn’t work.

We now need to configure LDAP:

 ldapclient -v manual \
 -a credentialLevel=self \
 -a authenticationMethod=sasl/gssapi \
 -a defaultSearchBase=dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk \
 -a defaultSearchScope=sub \
 -a domainName=socs-ad.cs.bham.ac.uk \
 -a defaultServerList="ad1.socs-ad.cs.bham.ac.uk ad2.socs-ad.cs.bham.ac.uk" \
 -a attributeMap=passwd:gecos=ad1.socs-ad.cs.bham.ac.uk \
 -a attributeMap=passwd:homedirectory=unixHomeDirectory \
 -a attributeMap=passwd:uid=sAMAccountName \
 -a attributeMap=group:uniqueMember=member \
 -a attributeMap=group:cn=sAMAccountName \
 -a objectClassMap=group:posixGroup=group \
 -a objectClassMap=passwd:posixAccount=user \
 -a objectClassMap=shadow:shadowAccount=user \
 -a serviceSearchDescriptor='passwd:ou=people,dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk?sub' \
 -a serviceSearchDescriptor='shadow:ou=people,dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk?sub?memberOf=CN=sysop,OU=Groups of People,OU=Groups,DC=socs-ad,DC=cs,DC=bham,DC=ac,DC=uk' \
 -a serviceSearchDescriptor='group:ou=groups,dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk?sub?(&(objectClass=group)(gidNumber=*))'

You’d need to tweak it for your environment of course. Importantly, you need the bit

(&(objectClass=group)(gidNumber=*))

for the group serviceSearchDescriptor, otherwise you’ll get spurious results if you have groups with no gidNumber assigned. Ideally we’d also have similar filters for passwd and shadow, but that didn’t seem to work properly.

Restart the nscd daemon:

svcadm restart name-service-cache

You could try doing an LDAP search with something like:

ldapsearch -R -T -h ad1.socs-ad.cs.bham.ac.uk -o authzid= -o mech=gssapi -b dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk -s sub cn=jaffle

and getent should also now work:

getent passwd
getent group

Restricting login access

So, we’ve managed to integrate our password data on the server. We pretty much need access to all our directory users for NFS to work properly, so that usernames and UIDs match; however, this means anyone can log in to the server. There’s no equivalent of Linux’s pam_access, and there doesn’t appear to be any native way of specifying Unix groups of people who can log in to the system. The closest I found was pam_list; however, this only works with netgroups, and as we don’t use those for anything any more, they were never migrated to our AD – and anyway, we’ve got perfectly good Unix groups of people in use on our other systems.

After running round in circles for a while, and almost creating netgroups, I came across a solution that seems to work nicely. It’s a bit of a hack, but actually it’s quite a nice solution for us. The key is in the ldapclient definition for shadow:

-a serviceSearchDescriptor='shadow:ou=people,dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk?sub?memberOf=CN=sysop,OU=Groups of People,OU=Groups,DC=socs-ad,DC=cs,DC=bham,DC=ac,DC=uk'

Note that we add an LDAP filter requiring membership of a specific LDAP group; this means that shadow data is only present for the members of that group and, hey presto, we’ve got Unix group-based authorisation working on OpenIndiana. If you wanted multiple groups, you’d have to tweak the filter with some parentheses and a |, probably…
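
Something along these lines, a sketch in which the second group (research) is purely hypothetical:

(|(memberOf=CN=sysop,OU=Groups of People,OU=Groups,DC=socs-ad,DC=cs,DC=bham,DC=ac,DC=uk)(memberOf=CN=research,OU=Groups of People,OU=Groups,DC=socs-ad,DC=cs,DC=bham,DC=ac,DC=uk))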

And automount/autofs?

Autofs from LDAP was a little more complicated to get going, probably complicated a bit by the fact that we have Linux-format, separately named autofs maps in our AD. Technically we don’t need this bit working for our file-server, but we thought that, for completeness, we’d investigate the options for it.

First off, we’ll edit a couple of files:

/etc/auto_home
  #+auto_home

/etc/auto_master
  #
  +auto_master
  /bham   +auto.linux
  #+auto.master

A long time ago we used NIS for Solaris and Linux – back then the maps had to be kept discrete, as Linux autofs didn’t have nested/multi-mount support and so needed the nested parts split into separate maps, which Solaris in turn couldn’t use. E.g. under Solaris we could do:

/bham
    ... /foo
    ... /baa
    ... /otherdir
            .... /foo
            .... /bar

But under Linux, it had to be:

/bham
    ... /foo
    ... /baa

and in another Linux map:

/bham
    ... /otherdir
            .... /foo
            .... /bar

When the NIS maps got carried over to LDAP for Linux when we rolled out Scientific Linux 6, this split got carried over as well. Now, of course, this won’t work with Solaris, and neither does it work with OpenIndiana.

After a bit of consideration, we thought we were going to have to build a separate set of maps for OI again. But then we found that Linux autofs 5 now supports multi-mount maps, so we can use traditional Solaris-format maps for nested directories in a single map. A quick test and an early-morning edit to the maps, and we can now use the same maps under both OSes.

ldapclient mod \
  -a "serviceSearchDescriptor=auto_master:cn=auto.master,OU=automount,OU=Maps,dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk" \
  -a "serviceSearchDescriptor=auto.home:cn=auto.home-linux,OU=automount,OU=Maps,dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk" \
  -a "serviceSearchDescriptor=auto.linux:cn=auto.linux,OU=automount,OU=Maps,dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk" \
  -a "serviceSearchDescriptor=auto.home-linux:cn=auto.home-linux,OU=automount,OU=Maps,dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk" \
  -a objectclassMap=automount:automountMap=nisMap \
  -a objectclassMap=automount:automount=nisObject \
  -a objectclassMap=auto.home-linux:automount=nisObject \
  -a objectclassMap=auto.linux:automount=nisObject \
  -a attributeMap=automount:automountMapName=nisMapName \
  -a attributeMap=automount:automountKey=cn \
  -a attributeMap=automount:automountInformation=nisMapEntry \
  -a attributeMap=auto.home-linux:automountMapName=nisMapName \
  -a attributeMap=auto.home-linux:automountKey=cn \
  -a attributeMap=auto.home-linux:automountInformation=nisMapEntry \
  -a attributeMap=auto.linux:automountMapName=nisMapName \
  -a attributeMap=auto.linux:automountKey=cn \
  -a attributeMap=auto.linux:automountInformation=nisMapEntry

One caveat to note is that we had to map each of the top-level named maps. One might think that the lines:

  -a attributeMap=automount:automountMapName=nisMapName \
  -a attributeMap=automount:automountKey=cn \
  -a attributeMap=automount:automountInformation=nisMapEntry \

would inherit, but apparently not!

So all that’s left to do is restart autofs:

svcadm enable autofs

Just as a side note on the format of maps we use for autofs, I’ve mentioned they are stored in our Active Directory. We’ve created a number of “nisMap” objects, for example, the object at “cn=auto.linux,OU=automount,OU=Maps,dc=socs-ad,dc=cs,dc=bham,dc=ac,dc=uk” is a nisMap object (probably created using ADSI edit, but I think there’s a tab available if you install the right roles on the server).

The nisMap object then contains a number of nisObject objects. e.g.:

CN=bin,CN=auto.linux,OU=automount,OU=Maps,DC=socs-ad,DC=cs,DC=bham,DC=ac,DC=uk
nisMapName -> bin
nisMapEntry -> -rw,suid,hard,intr jaffle:/vol/vol1/bham.linux/bin

For a multi-mount map entry, each mount point is just space-separated in the nisMapEntry, e.g.:

CN=htdocs,CN=auto.linux,OU=automount,OU=Maps,DC=socs-ad,DC=cs,DC=bham,DC=ac,DC=uk
nisMapName -> htdocs
nisMapEntry -> /events -rw,hard,intr jaffle:/vol/vol1/htdocs/web-events /hci -rw,hard,intr jaffle:/vol/vol1/htdocs/web-hci ...

(and yes, if you’re reading in the feed, the parts did get mixed up … I wrote part 4 before 3!)

part 1 – Hardware and Basic information
part 2 – Base Operating System Installation
part 3 – Configuring Active Directory authentication integration
part 4 – Configuring the storage pool and auto snapshots
part 5 – NFS Server config & In-kernel CIFS

Configuring the Storage Pools – Building an OpenIndiana based ZFS File Server – part 4

ZFS offers a number of different types of RAID configuration: raid-z1 is basically like traditional RAID-5 with a single parity disk, raid-z2 has two parity disks, and there’s also raid-z3 with three. Actually, “parity disk” isn’t strictly correct, as the parity is distributed across the disks in the set (three sets of parity in the raid-z3 case) rather than held on dedicated drives.

Using multiple parity sets is important, particularly when using high-capacity disks. It’s not necessarily just a case of losing multiple disks, but also of taking account of what happens if you get block failures on a high-capacity disk. In fact, on our NetApp filer head we’ve been using RAID-DP for some years now. We’ve actually had a double disk failure occur on a SATA shelf, but we didn’t lose any data, and the hot spare was spun in quickly enough that we’d have had to lose another two disks to actually lose anything.

For this file-server we’ve got a 16-disk JBOD array, and we had a few discussions about how to configure it – should we have hot spares? Looking around, people always seem to suggest having a global hot spare available. However, with raid-z3, we’ve decided to challenge that view.

We were taking it for granted that we’d run at least raid-z2 with dual parity and a hot spare; however, with raid-z3 we don’t believe we need a hot spare. Using all the disks in raid-z3, we’re not burning any more disks than we would have done with raid-z2 + hot spare, and the extra parity set effectively acts as if we had a hot spare – except it’s already resilvered and spinning with the data available. In that sense, we’d be able to lose three disks and still not have lost any data – some performance, yes, but no data until we lost a fourth disk. That’s the same as raid-z2 + hot spare, except there we’d have a window of risk whilst the hot spare resilvered…
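
To put numbers on that for our 16-disk shelf: raid-z3 across all 16 disks gives 13 disks of usable capacity plus 3 of parity, while raid-z2 plus a global hot spare would give the same 13 disks of usable capacity (2 parity + 1 idle spare) – identical usable space, but with the “spare” already carrying live parity.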

Of course, you need to take a look at the value of your data and consider the options yourself! There’s an Oracle paper on RAID strategy for high capacity drives.

Configuring the Pool

All the examples of building a pool use the name tank for the pool, so I’m going to stick with tradition and call ours tank as well. First off, we need to work out which disks we want to add to the pool – format is our friend here:

root@bigstore:~# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c2t10d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@a,0
       1. c2t11d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@b,0
       2. c2t12d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@c,0
       3. c2t13d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@d,0
       4. c2t14d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@e,0
       5. c2t15d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@f,0
       6. c2t16d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@10,0
       7. c2t17d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@11,0
       8. c2t18d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@12,0
       9. c2t19d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@13,0
      10. c2t20d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@14,0
      11. c2t21d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@15,0
      12. c2t22d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@16,0
      13. c2t23d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@17,0
      14. c2t24d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@18,0
      15. c2t25d0 <ATA-Hitachi HUA72303-A5C0-2.00TB>
          /pci@0,0/pci8086,3410@9/pci8086,346c@0/sd@19,0
      16. c3t0d0 <ATA-INTELSSDSA2BT04-0362 cyl 6228 alt 2 hd 224 sec 56>
          /pci@0,0/pci8086,3a40@1c/pci8086,3505@0/sd@0,0
      17. c3t1d0 <ATA-INTELSSDSA2BT04-0362 cyl 6228 alt 2 hd 224 sec 56>
          /pci@0,0/pci8086,3a40@1c/pci8086,3505@0/sd@1,0
      18. c3t2d0 <ATA-INTEL SSDSA2BT04-0362-37.27GB>
          /pci@0,0/pci8086,3a40@1c/pci8086,3505@0/sd@2,0
      19. c3t3d0 <ATA-INTEL SSDSA2BT04-0362-37.27GB>
          /pci@0,0/pci8086,3a40@1c/pci8086,3505@0/sd@3,0
      20. c3t4d0 <ATA-INTEL SSDSA2BW16-0362-149.05GB>
          /pci@0,0/pci8086,3a40@1c/pci8086,3505@0/sd@4,0
      21. c3t5d0 <ATA-INTEL SSDSA2BW16-0362-149.05GB>
          /pci@0,0/pci8086,3a40@1c/pci8086,3505@0/sd@5,0
Specify disk (enter its number):

Disks 0-15 are the SATA drives in the JBOD (note they are truncated to 2TB at present, though they are actually 3TB drives – we’re awaiting a new controller; as I mentioned in part 1, the LSI 1604e chip on the external SAS module truncates the disks. When the new controller arrives we’ll have to re-create the pool, but 2TB is fine for our initial configuration and testing!).

Now let’s create the pool with all the disks using raidz3:

zpool create tank raidz3 c2t10d0 c2t11d0 c2t12d0 c2t13d0 c2t14d0 c2t15d0 c2t16d0 c2t17d0 c2t18d0 c2t19d0 c2t20d0 c2t21d0 c2t22d0 c2t23d0 c2t24d0 c2t25d0
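
It’s worth checking that the layout came out as expected before going any further:

zpool status tank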

To act as a read cache, we also have 2x 160GB Intel SSDs, so we need to add them to the pool as L2ARC cache devices:

zpool add tank cache c3t4d0 c3t5d0

You can’t mirror the L2ARC, though I did read some interesting thoughts on doing so – I don’t think it’s necessary in our case; however, I could see how suddenly losing the cache in some environments might have a massive performance impact that could become mission critical.

We’ve also got a couple of SSDs in the system to act as the ZFS intent log (ZIL), so we’ll add them to the array:

zpool add tank log mirror c3t2d0 c3t3d0

Note that they are mirrored. The ZIL is used to speed up writes by committing the data to SSD before it is later written back to spinning disk; its purpose is to allow a write to be acknowledged to the client even though it’s not yet been fully committed to spinning disk, and it acts as a last resort should a write to spinning disk not have completed in the event of a system failure (e.g. a power cut). With high-speed drives it’s perhaps not necessary, but under heavy load it will improve write performance. Given that it’s important it stays consistent, we mirror it.

Ideally the drives should be SLC rather than MLC, but there’s a cost trade-off to be had there.

De-duplication and compression

ZFS allows de-duplication and compression to be enabled at pool or dataset/file-system level. We’ll enable both at the pool level and the datasets will inherit the settings from there:

zfs set dedup=on tank
zfs set compression=on tank
zfs get compression tank
zfs get compressratio tank

The latter command shows the compression ratio, which will vary over time depending on what files are in the tank.
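
The equivalent figure for de-duplication lives on the pool rather than the dataset, so something like the following should show it:

zpool get dedupratio tank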

When thinking about de-duplication, it’s important to consider how much storage you have and how much RAM is available. This is because you want the de-duplication hash tables to be in memory as much as possible for write speed. We’ve only got 12GB RAM in our server, but from what I’ve read, the L2ARC SSDs should also be able to hold some of the tables and pick up some of the workload there.

When setting up de-duplication, we had some discussions about hashing and how it might work in relation to hash clashes … Bonwick’s blog has a lot of answers on this topic!

Just to touch briefly on quotas when using compression and de-duplication. This isn’t something I’ve seen a definitive answer on, and I don’t have time to look at the source code, but here’s my supposition: quotas do take account of compression, i.e. they account for the blocks used on disk rather than the size of the file, but they don’t take account of any de-duplicated data. I’m pretty sure this must be the case, otherwise the first block owner would be penalised, and the final block owner could suddenly go massively over quota if other copies were deleted.

And a quick note on compression … this is something which I was mildly bemused by on first look:

milo 31% du -sh /data/private/downloads/
4.1G	/data/private/downloads/
milo 32% date && cp -Rp /data/private/downloads/* . && date
Fri May 25 15:25:39 BST 2012
Fri May 25 15:28:29 BST 2012

milo 34% du -sh .
3.0G	.

So by the time the files were copied to the NFS-mounted, compressed ZFS file-system, they’d shrunk by 1.1G. Thinking about it, I should have realised that du was showing the size of the blocks used rather than the file sizes, but at first glance I was a little bemused!

Automatic snapshots

We’d really like to have automatic snapshots available for our file-systems, and time-slider can provide this. The only real problem is that time-slider is a GUI application, and on a file-server that isn’t ideal. Anyway, I found it’s possible to configure auto-snapshots from the terminal. First off, install time-slider:

 pkg install time-slider
 svcadm restart dbus
 svcadm enable time-slider

Note, if you don’t restart dbus after installing time-slider, you’ll get python errors out of the time-slider service like:

dbus.exceptions.DBusException: org.freedesktop.DBus.Error.AccessDenied: Connection ":1.3" is not allowed to own the service "org.opensolaris.TimeSlider" due to security policies in the configuration file

To configure, you need to set the hidden ZFS property on the pool and then enable the auto-snapshots that you require:

zfs set com.sun:auto-snapshot=true tank
svcadm enable auto-snapshot:hourly
svcadm enable auto-snapshot:weekly
svcadm enable auto-snapshot:daily

This blog post gives more details on configuring it; in short, you need to export the auto-snapshot manifest, edit it and then re-import it.

However, you should also be able to manipulate the service using svccfg, for example:

svccfg -s auto-snapshot:daily setprop zfs/keep= astring: '7'

would change the keep period for daily snapshots to 7. The interesting properties (depending on whether you are changing monthly, daily, hourly or weekly) are:

svccfg -s auto-snapshot:daily listprop zfs
zfs           application
zfs/interval  astring  days
zfs/period    astring  1
zfs/keep      astring  6
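
One thing to be aware of (an assumption worth verifying on your own system): after changing a property with svccfg, SMF normally needs the service refreshing, and possibly restarting, before the new value is picked up, along the lines of:

svcadm refresh auto-snapshot:daily
svcadm restart auto-snapshot:daily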

To list your snapshots, use the following:

root@bigstore:~# zfs list -t snapshot
NAME                                                  USED  AVAIL  REFER  MOUNTPOINT
rpool/ROOT/openindiana-1@install                     17.7M      -  1.52G  -
rpool/ROOT/openindiana-1@2012-05-18-14:54:49          317M      -  1.82G  -
rpool/ROOT/openindiana-1@2012-05-24-10:58:07         65.6M      -  2.32G  -
tank@zfs-auto-snap_weekly-2012-05-24-12h14               0      -  90.6K  -
tank@zfs-auto-snap_daily-2012-05-27-12h14                0      -  90.6K  -
tank@zfs-auto-snap_hourly-2012-05-28-11h14               0      -  90.6K  -
tank/research@zfs-auto-snap_weekly-2012-05-24-12h14  55.0K      -  87.3K  -
tank/research@zfs-auto-snap_hourly-2012-05-24-13h14  55.0K      -  87.3K  -
tank/research@zfs-auto-snap_hourly-2012-05-25-09h14  64.7K      -   113K  -
tank/research@zfs-auto-snap_hourly-2012-05-25-11h14  55.0K      -   116K  -
tank/research@zfs-auto-snap_daily-2012-05-25-12h14   71.2K      -  25.2M  -
tank/research@zfs-auto-snap_hourly-2012-05-25-16h14   162K      -  3.00G  -
tank/research@zfs-auto-snap_daily-2012-05-26-12h14   58.2K      -  3.00G  -
tank/research@zfs-auto-snap_hourly-2012-05-27-04h14  58.2K      -  3.00G  -
tank/research@zfs-auto-snap_daily-2012-05-27-12h14       0      -  3.00G  -
tank/research@zfs-auto-snap_hourly-2012-05-28-11h14      0      -  3.00G  -

Just as a caveat on auto-snapshots, the output above is from a system that has been running weekly, daily and hourly snapshots for several days. Note that the hourly snapshots seem a bit sporadic in their availability. This is because if the intermediate (hourly or daily) snapshots are 0K in size they get removed, i.e. snapshots will only be listed where there’s actually been a change in the data. This seems quite sensible really…

Finally, we want to make the snapshots visible to NFS users:

zfs set snapdir=visible tank/research
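
NFS clients can then browse the snapshots themselves under the hidden .zfs directory at the top of the mounted file-system, for example:

ls <mountpoint>/.zfs/snapshot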

For Windows users connecting via the in-kernel CIFS server, the previous versions tab should work.

part 1 – Hardware and Basic information
part 2 – Base Operating System Installation
part 3 – Configuring Active Directory authentication integration
part 4 – Configuring the storage pool and auto snapshots
part 5 – NFS Server config & In-kernel CIFS

Building an OpenIndiana based ZFS File Server – part 2

OpenIndiana Installation

Getting OpenIndiana onto our file-server hardware was a pretty simple affair: download the memory stick image, dd it onto a fresh memory stick and install… actually, we did struggle with this to start with – we were planning on testing it on an old PC in my office (a 2007-generation 965-based Core 2 Duo PC – I can’t believe these are the machines coming out of desktop service!).
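
For reference, writing the image is just the usual dd; the image filename and target device below are assumptions, so double-check the device name first – dd will happily overwrite the wrong disk:

# write the OpenIndiana live USB image to the memory stick
dd if=oi-dev-151a-live-x86.usb of=/dev/sdX bs=4M
sync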

Basically, follow the text-based installer – it’s quite reminiscent of pre-Jumpstart installs! We didn’t enable any additional users, as we were planning on integrating with our Active Directory for authentication. If you do create one, the installer will disable direct root logins by default.

The installer creates a default ZFS rpool for the root file-system; as we want that to be a mirror, we needed to add a second device once it had booted. This blog entry has some instructions on this. As it’s been a while since I’ve done Solaris, I’d actually approached it in a different manner (using format and fdisk to manipulate the disk partitions by hand), but prtvtoc | fmthard is of course the way we used to do it with Solstice DiskSuite, and it’s quicker and easier.
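
A sketch of the prtvtoc | fmthard approach, using the device names that appear below (adjust to suit your own disks):

# copy the partition table from the first root disk to the second
prtvtoc /dev/rdsk/c3t0d0s2 | fmthard -s - /dev/rdsk/c3t1d0s2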

If we hadn’t already installed Linux onto the disk to test initially, we probably wouldn’t have got the error

cannot label 'c3t1d0': EFI labeled devices are not supported
    on root pools

when trying to add the disk directly when running

zpool attach rpool c3t0d0s0 c3t1d0s0

Clearing and reinitialising the partition table resolved that problem though. Don’t forget to use the installgrub command on the second disk to install the boot loader onto the disk, otherwise you won’t actually be able to boot off it in the event of a primary disk failure!

installgrub -m /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c3t1d0s0

Ensure that the disks have finished resilvering before you reboot the system! (zpool status will tell you!).

The resilver process in ZFS is different to what those familiar with mirrors in DiskSuite or Linux software RAID will expect: it doesn’t create an exact block-for-block mirror of the disks, it only copies data blocks, so the data layout on the disks is likely to be different. It also means that if you have a 100GB drive with only 8GB of data, you only need to mirror the 8GB of data, not all the empty file-system blocks as well…

Updating the OS

Since OpenIndiana 151a was released there have been a bunch of updates made available, so you’ll want to upgrade to the latest image. First off, update the pkg package:

pkg install package/pkg

You can then do a trial run image update:

pkg image-update -nv

Run it again without the “n” flag to actually do the update. This will create a new boot environment, which you can list using the command:

root@bigstore:~# beadm list
BE            Active Mountpoint Space Policy Created
openindiana   -      -          8.17M static 2012-05-18 15:11
openindiana-1 NR     /          3.95G static 2012-05-18 15:54
openindiana-2 -      -          98.0K static 2012-05-24 11:58

That’s it, you’re installed and all up to date!

In addition to the base Operating System repositories, we also add some extra repos:

pkg set-publisher -p http://pkg.openindiana.org/sfe
pkg set-publisher -p http://pkg.openindiana.org/sfe-encumbered

Enabling Serial Console

All of our servers are connected to Cyclades serial console servers; this lets us connect to the serial ports via ssh from remote locations, which is great when bits of the system go down. To enable the serial console, you need to edit the grub config file; as our root pool is the ZFS rpool, it’s located at /rpool/boot/grub/menu.lst. You need to comment out the splashimage line if present (as you can’t display the XPM down a serial port!), then add a couple more lines to enable serial access for grub.

 ###splashimage /boot/grub/splash.xpm.gz
 serial --unit=0 --speed=9600
 terminal --timeout=5 serial

You also need to find the kernel line and append some data to that:

-B console=ttya,ttya-mode="9600,8,n,1,-"

As there was already a -B flag in use, our resulting kernel line looked like this:

kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=ttya,ttya-mode="9600,8,n,1,-"

You might want to tweak the line-speed, parity etc for your environment.

Static IP address

For our servers, we prefer to use static IP addressing rather than have them use DHCP. As we selected auto-configuration during the install, we need to swap to a static IP address. Note that if you don’t have console access (serial console or real console), you’ll likely get disconnected!

First off, we need to disable the DHCP client. The OpenIndiana DHCP client is part of the nwam service:

svcadm disable network/physical:nwam

Boom! Down goes the network connection. So make sure you have access via another method! I’ve been adminning Solaris for a long time (since 2001), so I’m going to configure networking in the traditional (pre-Solaris 10) manner – with some files – but you can do this step with the ipadm command.

/etc/hostname.ixgbe0
    bigstore
/etc/hosts
    147.188.203.45 bigstore bigstore.cs.bham.ac.uk bigstore.local
    ::1 bigstore bigstore.local localhost loghost
/etc/defaultrouter
    147.188.203.6
/etc/netmasks
    147.188.0.0     255.255.255.0
/etc/resolv.conf
    domain  cs.bham.ac.uk
    search cs.bham.ac.uk
    nameserver  147.188.192.4
    nameserver  147.188.192.8
cp /etc/nsswitch.dns /etc/nsswitch.conf
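
For reference, the ipadm route mentioned above would look roughly like this – a sketch using the same interface and addresses as the files, and the syntax may vary slightly between releases:

ipadm create-if ixgbe0
ipadm create-addr -T static -a 147.188.203.45/24 ixgbe0/v4static
route -p add default 147.188.203.6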

Finally we need to enable the static IP configuration service:

svcadm enable network/physical:default

part 1 – Hardware and Basic information
part 2 – Base Operating System Installation
part 3 – Configuring Active Directory authentication integration
part 4 – Configuring the storage pool and auto snapshots
part 5 – NFS Server config & In-kernel CIFS

Building an OpenIndiana based ZFS File Server

We’ve recently been looking at options to replace one of our ageing file-servers which stores research data. The data is currently sat on an out-of-warranty Infortrend RAID array which we purchased back in July 2005. Basically its got 3TB of storage, so was a reasonable size when we bought it. Its attached to a Sun V20z box running Linux, serving data via NFS and Samba.

Our home directory storage is hung off a NetApp FAS3140 filer head, but we just couldn’t afford to purchase additional storage on there for the huge volumes of research data people want to keep.

So we looked around. Spoke to some people. Then came up with a plan. We’ve decided to home-brew a file-server based on ZFS as the underlying file system. ZFS was developed by Sun for Solaris and is designed to cope with huge file-systems, and provides a number of key features for modern file-systems, including snapshotting, deduplication, cloning and management of disk storage in pools. You can create and destroy file systems on the fly within a ZFS pool (a file-system looks like a directory in userland, you just create it with a simple zfs create command).
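
For example, creating a new file-system inside a pool really is a one-liner (the pool and dataset names here are the ones used later in this series):

zfs create tank/research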

Since OpenSolaris, there have been ports of ZFS to other operating systems: FreeBSD has an implementation, and there’s even ZFS on Linux.

Hardware spec

As we were pondering an Illumos based OS for the system, we wanted kit we’d be fairly happy was supported. There’s no real compatibility list for Illumos, but digging around we found some stuff we’d be fairly happy would work.

One pair of 40GB SSDs forms the mirrored Operating System pair, another mirrored pair of SSDs is for the ZFS intent log (ZIL), and the two 160GB SSDs provide an L2ARC cache.

Once the kit arrived and was racked up a bit of testing was done, but I was out of the office for a few weeks so didn’t inspect it properly until later. In short, the external SAS module (based on the LSI 1604e chipset) was causing us some problems – the 3TB drives were being truncated to 2TB. After several calls to Viglen technical support, it transpired that the external SAS module doesn’t support 3TB drives. They’ve agreed to ship out a new LSI 9205-8e HBA, which we believe should work fine with OpenIndiana.

Initially we were offered a MegaRAID card, but there’s no mention of a Solaris 10 driver on the LSI site, so we steered clear of it.

And for an operating system?

We started off with one of our standard Scientific Linux 6 server installs to test out the hardware. We also built up ZFS on Linux to try it out. It works great (and we later even imported a pool created under Linux into a SmartOS instance). Our main issues with this approach are:

  • We have to build ZFS for the kernel, and kernel updates would break our ZFS access – we’d have to rebuild for each update.
  • ZFS on Linux with the POSIX layer isn’t as mature as ZFS on Illumos
  • We’d have to use Samba to share to our Windows clients
  • ZFS snapshots are inaccessible from NFS clients

We were always planning on looking at an Illumos based build (we did consider Oracle Solaris on non-Oracle hardware, but the cost is quite high). So we looked around at our options.

SmartOS

Hey, wouldn’t it be neat if we could run SmartOS on it – this is an Illumos-based build which boots off a memory stick, and you then run KVM/zones virtual machines inside it to provide services. The outer SmartOS is transient, but the inner VMs are persistent on the ZFS pool. The big problem with this, though, is that you can only run an NFS server in the global zone, as it’s tightly hooked into the kernel. This basically scuppers the plan to use SmartOS as our option here.

OpenIndiana

There’s numerous other Illumos based distributions about, but OpenIndiana is probably the closest to how OpenSolaris was when it was “closed”, and possibly the most stable of the distributions.

Installing OI is easy from USB memory stick or CD. Getting it fully integrated into our system was another matter! We had a bunch of issues with Kerberos authentication (we authenticate Linux against our AD), autofs and our AD/LDAP maps, and then getting the in-kernel CIFS service working as we have a disjoint DNS namespace (our Windows DNS name is different from our normal working cs.bham.ac.uk) – it broke the Kerberos authentication we’d already got working!

I’ll post some more details on getting things working in another article!

part 1 – Hardware and Basic information
part 2 – Base Operating System Installation
part 3 – Configuring Active Directory authentication integration
part 4 – Configuring the storage pool and auto snapshots
part 5 – NFS Server config & In-kernel CIFS

High availability and clustering

Whilst we have pretty good uptime and availability on most of the systems we run here, we do get the odd hardware problem. When it’s a disk that’s gone down, that’s not a problem, as our physical servers are all running mirrored pairs in hot-swap enclosures, so we can swap disks out easily.

We’ve recently been doing a lot of work into reducing the number of physical servers we have by moving into a virtualised environment (QEMU+KVM running on Scientific Linux 6 x86_64 hardware). We’ve got some fairly major drivers to virtualise more hardware due to an ageing server estate. Virtualising our estate gives us the opportunity to review how we’ve got systems configured and to look at if we can improve service by introducing high-availability and load-balancing.

In the past week I’ve been looking at our web presence, www.cs.bham.ac.uk. The current infrastructure for this was installed in 2006, and things were a little different back then. Before the 2006 system, the site was running on a Sun Ultra 5 with NFS-mounted file-store. The power supply to the building was unreliable, and the file-server could take several hours to reboot following a power outage. There was demand to be able to provide at least some web presence for external users whilst waiting for the systems to reboot. We built a system out of two Sun X2100 servers: one a stand-alone web-server serving the very front few web pages, the second using NFS mounts to serve the rest of the site. We used the Apache httpd proxy module to transparently forward requests on to the back-end hardware.

Fast forward 6 years. We still use an NFS file-server for all our stuff, but it’s considerably more reliable – you’d expect that, we have a NetApp storage system. We also have high-speed interconnects between our switching fabric and the network (10GbE links to the filer and between the core fabric).

LVS Direct Routing

We’ve got a pair of SL6 machines running in our virtual environment which are configured as a hot/standby LVS load-balancer. We’re using piranha to manage the LVS configs. It’s something I’ve used before, but never really in anger (we did some experimentation a few years ago running Samba in a cluster – it was fine till we did a fail-over of a node…). LVS is actually really easy to get installed and working, and the fail-over between nodes seems to work reliably. It’s not great when you change the config, though, as you have to restart pulse – and the second node has a habit of taking over the cluster at that point!

I’ve been pondering how to move the www service into a high-availability configuration. One option is to continue to have a front-end node using mod_proxy to forward requests to a back-end set of servers. The problem with this is that the front-end server will either be a physical machine or a VM (and the VM host servers require our NFS file-server). It could be an HA cluster of front-end machines, but it will still rely on the load balancers, which again are VMs and require the NFS server to be up.

mod_proxy

We already use mod_proxy in httpd to handle calls to back-end servers; for example, our “personal” home pages are served from a completely separate VM. In the past, personal pages with “issues” have had an impact on the main www web presence, so we separated them. The following config snippet shows what we do here:

    <Proxy balancer://staffwebcluster>
      BalancerMember http://staffweb-lb-1.cs.bham.ac.uk loadfactor=1
      BalancerMember http://staffweb-lb-2.cs.bham.ac.uk loadfactor=1
      ProxySet lbmethod=bytraffic
    </Proxy>
    <Proxy balancer://staffwebclusterssl>
      BalancerMember https://staffweb-lb-1.cs.bham.ac.uk loadfactor=1
      BalancerMember https://staffweb-lb-2.cs.bham.ac.uk loadfactor=1
      ProxySet lbmethod=bytraffic
    </Proxy>
    RewriteCond %{HTTPS} on
    RewriteRule ^/~(.*) balancer://staffwebclusterssl/~$1 [P]
    RewriteRule ^/~(.*) balancer://staffwebcluster/~$1 [P]

The problem with this approach for a “front-end” server cluster is that we’ve carefully used firewall marks on the load balancer to ensure visitors hit the same front-end server for both HTTP and HTTPS transactions, but we then have no control over which back-end server the client connects to, as that’s determined by Apache’s load balancing.

So, given that we rely on our NFS server being up for practically the whole system to be available, do we really still need a front-end/back-end configuration? I’m fairly sure we don’t in real terms. So whilst we’ll continue to use mod_proxy to allow us to run whole sections of our web-server on different real hosts (some parts of the www site are even proxied to an IIS server), we’ll be dropping the front-end/back-end approach and letting the load-balancer handle the traffic for us.

Research Computing Blogs …

A few years ago we looked at providing a blog service to support a teaching module; back then the only option for multiple blogs was either to install multiple instances of WordPress or to opt for WordPress MU. Happily, things have moved on a long way with WordPress since then: there are now “networks”, which form an integrated part of the WordPress code base – with MU, we quickly found we were stuck on an outdated version.

http://researchblogs.cs.bham.ac.uk

So proudly, today I announce that we’re now providing http://researchblogs.cs.bham.ac.uk/ to allow research members of the school the ability to create blogs about their research.

We really don’t like just bunging in new systems which aren’t integrated into anything else, so we’re using a couple of plug-ins to help tie authentication into our normal authentication systems. The http-authentication plugin allows one to use Apache auth to provide logins. A few years back I wrote the authentication module we use, and this provides integrated cookie-based authentication across a number of our sites.

So researchers here can now register on-line for blogging. This sets them up inside the WordPress world.

Why not just use normal WordPress registration?

Whilst WordPress has configuration options to disable registration or to restrict it to particular email domains, at present we only want to allow our research staff to register, so we’ve provided a click-to-register option. We also don’t want everyone who can use the system to be able to create their own blogs; again, we’d like to restrict that to a subset of users.

There’s no easy way to accomplish this sensibly with WordPress right now – there are no command-line tools, and poking things directly into the WordPress database is just going to cause trouble in the future – so I’ve written some code to act as an API. Internally, my API code uses the Curl module in PHP to authenticate into WordPress as a trusted user, which then allows it to make calls from the WordPress function reference; for example, it logs in as an admin user internally and then calls the get_blog_details function to find out info on a blog. The main page of the site uses this internally to render all the blogs (plus a tweak to the Apache index handler to load a different page by default). This means we can list currently active and archived blogs, derived using the WordPress functions rather than by poking into the database directly.

As we know our researchers have collaborators, we’ve also built part of the API to allow staff to add external users, who are authenticated using WordPress’s internal authentication system, so staff will be able to add collaborators and allow them to post onto research blogs. Hopefully this will make it a workable solution for our researchers!

And what’s with the robot man?

That’s the building we’re located in. The statue is right outside my office. It’s called “Faraday” and was designed by Eduardo Paolozzi.

And if we’ve changed the theme since posting … he’ll be gone by now!
