Solaris 10 Release and Solaris Cluster 3.x upgrade + patching

At the time of writing, the newest version of Solaris 10 is u11 (1/13) and the newest Cluster release for Solaris 10 is 3.3u2.
Cluster 3.2 is still supported by Oracle, but patches are no longer released for it. Update 11 is probably the last release of Solaris 10 and brings some new features that you will not get by only patching a system with the Recommended Patchset.
Before starting the upgrade it is good to check and fix any issues with the current system and environment.
Check the cluster status, quorum device status, running services, zpools, metadevices, metasets, hardware components etc.
If you find any issues, fix them before you start. Plan your maintenance window and make a backup of your files and configuration.
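Nothing fancy, just a quick checklist; these are standard Solaris and Cluster commands, adjust them to what you actually run:

# cluster status                 # overall cluster health
# clq status                     # quorum devices
# clrg status                    # resource groups
# svcs -xv                       # SMF services with problems
# zpool status -x                # ZFS pool health
# metastat -c                    # SVM metadevices
# metadb                         # state database replicas
# prtdiag -v                     # hardware status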

Now let's check which Solaris release and Cluster release are installed:

$ cat /etc/release
                   Oracle Solaris 10 9/10 s10s_u9wos_14a SPARC
     Copyright (c) 2010, Oracle and/or its affiliates. All rights reserved.
                            Assembled 11 August 2010
$ cat /etc/cluster/release
                     Sun Cluster 3.2u1 for Solaris 10 sparc
           Copyright 2008 Sun Microsystems, Inc. All Rights Reserved.

If you already have the newest version of the OS or Cluster, you can simply skip the related part of this article.
Prepare the Solaris 10 u11 ISO, the Recommended Patchset, the LU patch, the Cluster 3.3u2 package and the newest Cluster Core patch. You can download the binaries from Oracle after logging in to Oracle Support. If you don't have an account, you can create one; AFAIK a support contract is not needed to download software and patches.
Copy the above software to the servers and unpack it in a local directory. Do not use a shared NFS directory, because you will not be able to install patches from it. Leave the ISO image as is, it will be mounted as lofs.
Apropos lofs: if you have the automounter (autofs service) and the SUNW.nfs resource enabled, you probably have lofs disabled in your /etc/system:

# grep lofs /etc/system
exclude: lofs

In that case, before you start the OS release upgrade you will need to enable lofs (remove or comment out the above line in /etc/system) and reboot. There is no other way, sorry.
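A minimal sketch of that fix: the comment character in /etc/system is an asterisk, so after the edit the line should look like this, followed by a reboot:

# grep lofs /etc/system
* exclude: lofs
# init 6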
Let's start with the preparation phase. I will use /var/tmp/patching as the software directory and /mnt as the ISO/ABE mount point. To /var/tmp/patching/prereq I copied patches which are good to have installed before Alternate Boot Environment creation and patching. This is my list:
119254-92
121428-15
121430-93
140914-02
142911-01
142933-05
146578-06
148027-06

# mkdir /var/tmp/patching && cd /var/tmp/patching
# unzip -q 10_Recommended.zip && rm 10_Recommended.zip
# cd prereq
# for a in `ls *.zip`; do unzip -q $a && rm $a; done
# for a in `ls *`; do patchadd $a; done
# cd ..

Do not worry if one or more of the above patches won't install; the most important one is 121430-XX, the Live Upgrade patch. Without it, your chances of hitting a bug during ABE creation are significantly increased. For x86 systems the patch number is one higher: 121431-XX.
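A quick way to verify that the LU patch really landed before you go further (showrev is part of base Solaris; it should print the installed revision):

# showrev -p | grep 121430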
Now it's time to apply the prerequisite patches from the Recommended Patchset:

# cd /var/tmp/patching/10_Recommended
# ./installpatchset --apply-prereq --s10patchset
Recommended OS Patchset Solaris 10 SPARC (2015.12.24)

Application of patches started : 2016.02.19 07:23:42

Applying 120900-04 ( 1 of 10) ... skipped
Applying 121133-02 ( 2 of 10) ... skipped
Applying 119254-92 ( 3 of 10) ... skipped
Applying 119317-01 ( 4 of 10) ... skipped
Applying 121296-01 ( 5 of 10) ... skipped
Applying 138215-02 ( 6 of 10) ... success
Applying 148336-02 ( 7 of 10) ... success
Applying 146054-07 ( 8 of 10) ... skipped
Applying 142251-02 ( 9 of 10) ... skipped
Applying 125555-16 (10 of 10) ... success

Application of patches finished : 2016.02.19 07:25:41

Following patches were applied :
 138215-02     148336-02     125555-16

Following patches were skipped :
 Patches already applied
 120900-04     119254-92     121296-01     146054-07     142251-02
 121133-02     119317-01

Installation of prerequisite patches complete.

Install log files written :
  /var/sadm/install_data/s10s_rec_patchset_short_2016.02.19_07.23.42.log
  /var/sadm/install_data/s10s_rec_patchset_verbose_2016.02.19_07.23.42.log

That is about 10 patches which should be installed on the running system before the actual patching. Patch the system before patching the system 🙂
Now we will create an Alternate Boot Environment (ABE). If you have the root filesystem on ZFS, just run:

# lucreate -c CURRENT_NAME -n NEW_BE_NAME

and that's all: ZFS snapshots and new ZFS filesystems for the ABE will be created. You can check this with the 'zfs list' and 'lustatus' commands.
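For example, a quick look could be something like this (I assume here that the root pool is called rpool; adjust the pool and BE names to yours):

# lustatus
# zfs list -r -t all rpool | grep NEW_BE_NAME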
If you have a UFS mirror for the root filesystem, you need to split the mirror and create the ABE on the detached metadevice. But first you need to remove the 'global' option from /etc/vfstab for /global/.devices/node@X. Make a copy, then use your favorite editor and replace 'global' with '-', like below:

#/dev/md/dsk/d1160        /dev/md/rdsk/d1160       /global/.devices/node@1 ufs     2       no      global
/dev/md/dsk/d1160        /dev/md/rdsk/d1160       /global/.devices/node@1 ufs     2       no      -

And do the same on the other cluster nodes.
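Before editing, I would keep a backup copy of the original file on every node; the backup name below is just an example:

# cp -p /etc/vfstab /etc/vfstab.pre_lu
# vi /etc/vfstab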
Let's assume that the metadevices and the /etc/vfstab entries are as follows:

d1160            m  516MB d1161 d1162
    d1161        s  516MB c0t2d0s6
    d1162        s  516MB c0t3d0s6
d1130            m   19GB d1131 d1132
    d1131        s   19GB c0t2d0s3
    d1132        s   19GB c0t3d0s3
d1100            m   43GB d1101 d1102
    d1101        s   43GB c0t2d0s0
    d1102        s   43GB c0t3d0s0
d1110            m  4.0GB d1111 d1112
    d1111        s  4.0GB c0t2d0s1
    d1112        s  4.0GB c0t3d0s1

/dev/md/dsk/d1110       -       -       swap    -       no      -
/dev/md/dsk/d1100       /dev/md/rdsk/d1100      /       ufs     1       no      -
/dev/md/dsk/d1130       /dev/md/rdsk/d1130      /export ufs     2       yes     -
#/dev/md/dsk/d160       /dev/md/rdsk/d160       /globaldevices  ufs     2       yes     -
/devices        -       /devices        devfs   -       no      -
ctfs    -       /system/contract        ctfs    -       no      -
objfs   -       /system/object  objfs   -       no      -
swap    -       /tmp    tmpfs   -       yes     -
/dev/md/dsk/d1160       /dev/md/rdsk/d1160      /global/.devices/node@1 ufs     2       no      global

I will use truss to collect a trace during lucreate, in case of any issues:

truss -o lucreate.out -elaf \
  lucreate -c s10 -n s10u11_20160219 \
  -m /:/dev/md/dsk/d100:ufs,mirror -m /:/dev/md/dsk/d1102:detach,attach,preserve \
  -m /export:/dev/md/dsk/d130:ufs,mirror -m /export:/dev/md/dsk/d1132:detach,attach,preserve \
  -m -:/dev/md/dsk/d110:swap,mirror -m -:/dev/md/dsk/d1112:detach,attach,preserve \
  -m /global/.devices/node@1:/dev/md/dsk/d160:ufs,mirror \
  -m /global/.devices/node@1:/dev/md/dsk/d1162:detach,attach,preserve

There is another school of thought which says that one should split the mirror manually and then run lucreate, but the above scenario is much faster, because the data is already up to date and lucreate does not need to copy it. The ABE creation process looks as follows:

Determining types of file systems supported
Validating file system requests
Preparing logical storage devices
Preparing physical storage devices
Configuring physical storage devices
Configuring logical storage devices
Analyzing system configuration.
Updating boot environment description database on all BEs.
Updating system configuration files.
The device </dev/dsk/c0t3d0s0> is not a root device for any boot environment; cannot get BE ID.
Creating configuration for boot environment <s10u11_20160219>.
Source boot environment is <s10>.
Creating file systems on boot environment <s10u11_20160219>.
Creating <ufs> file system for </> in zone <global> on </dev/md/dsk/d100>.
Creating <ufs> file system for </export> in zone <global> on </dev/md/dsk/d130>.
Creating <ufs> file system for </global/.devices/node@1> in zone <global> on </dev/md/dsk/d160>.
Mounting file systems for boot environment <s10u11_20160219>.
Calculating required sizes of file systems for boot environment <s10u11_20160219>.
Populating file systems on boot environment <s10u11_20160219>.
Analyzing Primary boot environment.
Processing alternate boot environment.
Mounting ABE <s10u11_20160219>.
Cloning mountpoint directories.
Generating list of files to be copied to ABE.
Copying data from PBE <s10> to ABE <s10u11_20160219>.
100% of filenames transferred
Finalizing ABE.
Unmounting ABE <s10u11_20160219>.
Reverting state of zones in PBE <S10U11>.
Making boot environment <s10u11_20160219> bootable.
Setting root slice to Solaris Volume Manager metadevice </dev/md/dsk/d100>.
Population of boot environment <s10u11_20160219> successful.
Creation of boot environment <s10u11_20160219> successful.

If you have any issues with ABE creation, check whether the LU patch (121430-XX, or 121431-XX for x86) is installed, whether the metadevice names are correct, whether you made a typo in the lucreate command, etc. If you believe that everything is OK, analyze the truss output or send it to Oracle Support. If you made a typo and the mirrors are already split, do not worry; just do an 'old school' lucreate, like this:

truss -o lucreate1.out -elaf \
  lucreate -c s10 -n s10u11_20160219 \
  -m /:/dev/md/dsk/d100:ufs \
  -m /export:/dev/md/dsk/d130:ufs \
  -m -:/dev/md/dsk/d110:swap \
  -m /global/.devices/node@1:/dev/md/dsk/d160

You will need to wait some time for data synchronization between the current and the new BE. Check the BEs with lustatus:

# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
s10                        yes      yes    yes       no     -
s10u11_20160219            yes      no     no        yes    -

After ABE creation you should revert the 'global' option in /etc/vfstab on the running BE.
Now it's time to upgrade the OS release on the ABE if it's older than u11 (1/13). Mount the Solaris 10 u11 ISO image and run luupgrade:

# lofiadm -a /var/tmp/patching/sol-10-u11-ga-sparc-dvd.iso
/dev/lofi/1
# mount -F hsfs /dev/lofi/1 /mnt
# echo "autoreg=disable" > /var/tmp/patching/no-autoreg
# luupgrade -u -s /mnt -k /var/tmp/patching/no-autoreg -n s10u11_20160219

This will take a while, so let's make some coffee 🙂
If you run into trouble with luupgrade, such as 'ERROR: Cannot mount miniroot at …', scroll up and read about 'lofs'; you will need an additional reboot to go further.
After a successful upgrade it's time to patch the ABE:

# umount /mnt
# lofiadm -d /dev/lofi/1
# cd /var/tmp/patching/10_Recommended
# ./installpatchset -B s10u11_20160219 --s10patchset

Recommended OS Patchset Solaris 10 SPARC (2015.12.24)

Application of patches started : 2016.02.19 10:55:21

This will also take some time, but eventually you should see:

Application of patches finished : 2016.02.19 14:09:45


Following patches were applied :
 118666-86     125731-12     148031-05     149175-09     150537-01
 118667-86     126206-11     148049-04     149279-04     150539-01
...
Installation of patch set to alternate boot environment complete.

Please remember to activate boot environment s10u11_20160219 with luactivate(1M)
before rebooting.

Install log files written :
  /.alt.s10u11_20160219/var/sadm/install_data/s10s_rec_patchset_short_2016.02.19_10.55.21.log
  /.alt.s10u11_20160219/var/sadm/install_data/s10s_rec_patchset_verbose_2016.02.19_10.55.21.log

If you don't have Solaris Cluster installed, or you already have the newest version with patches, you can just activate the new BE with luactivate and reboot the system into the new release and patch level. DO NOT use 'reboot' or 'halt'; you need to use the 'init' command.
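A minimal sketch of that simple, no-Cluster-upgrade case, using the BE name from this article:

# luactivate s10u11_20160219
# init 6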
If you want to upgrade the Cluster as well, let me continue. Mount the ABE and upgrade the Cluster framework:

# lumount s10u11_20160219 /mnt
# cd /var/tmp/patching/SC-3.3/
# cd Solaris_sparc/Product/sun_cluster/Solaris_10/Tools
# ./scinstall -u update -R /mnt

Starting upgrade of Oracle Solaris Cluster framework software

Saving current Oracle Solaris Cluster configuration
Do not boot this node into cluster mode until upgrade is complete.
Renamed "/mnt/etc/cluster/ccr" to "/mnt/etc/cluster/ccr.upgrade".

** Removing Oracle Solaris Cluster framework packages **
...
** Installing SunCluster 3.3 framework **
        SUNWscu.....done
        SUNWsccomu..done
...
Restored  /mnt/etc/cluster/ccr.upgrade to /mnt/etc/cluster/ccr
Completed Oracle Solaris Cluster framework upgrade
Updating nsswitch.conf ... done
Log file - /mnt/var/cluster/logs/install/scinstall.upgrade.log.20131

Now upgrade Cluster Agents:

# cd /mnt/usr/cluster/bin
# ./scinstall -u update -R /mnt -s all -d /var/tmp/patching/SC-3.3/Solaris_sparc/Product/sun_cluster_agents
Starting upgrade of Oracle Solaris Cluster data services agents

List of upgradable data services agents:
  (*) indicates selected for upgrade.
        * nfs
** Removing HA-NFS for Oracle Solaris Cluster **
        Removing SUNWscnfs...done

** Installing Oracle Solaris Cluster HA for NFS **
        SUNWscnfs...done

Completed upgrade of Oracle Solaris Cluster data services agents
Log file - /mnt/var/cluster/logs/install/scinstall.upgrade.log.13042

And apply the newest Cluster Core Patch:

# cd /var/tmp/patching/SC-3.3
# patchadd -R /mnt 145333-34

Don't forget to set the 'global' option back in the new BE's vfstab, then unmount the ABE:

# vi /mnt/etc/vfstab
# luumount /mnt
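For reference, the restored line in /mnt/etc/vfstab should look something like this; note that the new BE uses the metadevice you passed to lucreate (d160 in this example), not the old d1160:

/dev/md/dsk/d160        /dev/md/rdsk/d160       /global/.devices/node@1 ufs     2       no      global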

Do the same for all nodes in your cluster, and wait for the maintenance window to switch the boot environments.

What to do during the maintenance window?
It's good to upgrade the firmware if you don't already have the newest version, so download it and upload it to the service controller or a TFTP/FTP server.
Check the cluster and quorum status once again:

# cluster status
# clq status
# clrg status

Check and remove mediators from metasets, if any:

# medstat -s metaset1
# metaset -s metaset1 -d -m NODE1 NODE2
# medstat -s metaset1

Evacuate first node and activate new BE:

# clnode evacuate NODE1
# luactivate s10u11_20160219
# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
s10                        yes      yes    no        no     -
s10u11_20160219            yes      no     yes       no     -

Shut down the node with 'init 0'; DO NOT use 'reboot' or 'halt':

# init 0

Leave this node down; you can upgrade the firmware now if there is a newer one.
Repeat the same steps, starting with 'clnode evacuate NODEx', for all nodes in the cluster.
Now you should start the nodes in reverse order: first boot the node which was stopped last, and then join the other nodes to the cluster.
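A short sanity check once the nodes are back in the cluster; if you removed mediators before the switch, this is also the moment to add them back (metaset and node names as in the example above):

# clnode status
# clq status
# metaset -s metaset1 -a -m NODE1 NODE2
# medstat -s metaset1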
That's almost all. Almost, because the resource types should be upgraded now:

# clrs list -v
# clrt show -v |grep Upgrade

This will show what types of resources are in use and when they can be upgraded. Most of them can be upgraded 'Anytime' or 'When unmonitored', so downtime is not needed. Let's do this for the HAStoragePlus and LogicalHostname resources:

# clrt register HAStoragePlus
# clrt register SUNW.LogicalHostname
# clrt show -v |grep Upgrade
  --- Upgrade tunability for Resource Type SUNW.HAStoragePlus:10 ---
  Upgrade from 9:                               When unmonitored
  Upgrade from 8:                               When unmonitored
  Upgrade from 7:                               When unmonitored
  Upgrade from 6:                               When unmonitored
  --- Upgrade tunability for Resource Type SUNW.LogicalHostname:4 ---
  Upgrade from 3:                               Anytime
  Upgrade from 2:                               Anytime

# for a in `clrs list -v|grep SUNW.HAStoragePlus:6|cut -d" " -f1`; do 
>  clrs unmonitor $a
>  clrs set -p Type_version=10 $a
>  clrs monitor $a
> done
# clrt unregister HAStoragePlus:6
#
# for a in `clrs list -v|grep SUNW.LogicalHostname:2|cut -d" " -f1`; do 
>  clrs set -p Type_version=4 $a
> done
# clrt set -p RT_system=False SUNW.LogicalHostname:2
# clrt unregister SUNW.LogicalHostname:2

That's all folks, good luck!
Psst! Don't forget to delete the old BE and reattach the disks to the mirrors to get redundancy back on UFS.
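A sketch of that cleanup for the root mirror, assuming the metadevice names from the layout above; repeat for the swap, /export and /global/.devices mirrors and watch the resync with metastat:

# ludelete s10
# metaclear d1100        # old root mirror is gone, its leftover submirror d1101 survives
# metattach d100 d1101   # attach it to the new root mirror, resync starts
# metastat d100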
