

Live Upgrade

Error `Reading /etc/group failed`

When you see this error during creation of a new Live Upgrade area, check /etc/group for syntax errors: there should be no doubled colons, no stray trailing colon after the member list, and no two groups sharing the same GID in this file.
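The syntax rules above can be checked mechanically. A minimal sketch (the group names in the sample file are made up for illustration); every valid entry must have exactly four colon-separated fields, name:password:gid:members, and GIDs must be unique:

```shell
#!/bin/sh
# Build a sample group file containing both kinds of errors
# (names are hypothetical; point the checks at /etc/group for real use)
cat > /tmp/group.sample <<'EOF'
root::0:
staff::10:alice,bob
dup::10:carol
trail::11:dave:
EOF

# A valid entry has exactly 4 fields: name:password:gid:members.
# Doubled or trailing colons change the field count.
awk -F: 'NF != 4 { print "bad field count: " $0 }' /tmp/group.sample

# Groups sharing one GID (third field) also break lucreate
awk -F: 'seen[$3]++ { print "duplicate GID " $3 ": " $1 }' /tmp/group.sample
```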

Maintain vCPU and gather info

List your vCPU's

# psrinfo

Get number of vCPU's

# psrinfo -p

Get detailed information about vCPUs

# psrinfo -pv

and prtdiag (works only on SPARC):

# prtdiag
==================================== CPUs ====================================
      CPU                 CPU                         Run    L2$    CPU   CPU
LSB   Chip                 ID                         MHz     MB    Impl. Mask
---   ----  ----------------------------------------  ----   ---    ----- ----
 00     0      0,   1,   2,   3,   4,   5,   6,   7   xxxx   x.x        7  961
 00     1      8,   9,  10,  11,  12,  13,  14,  15   xxxx   x.x        7  961
 00     2     16,  17,  18,  19,  20,  21,  22,  23   xxxx   x.x        7  961
 00     3     24,  25,  26,  27,  28,  29,  30,  31   xxxx   x.x        7  961
 01     0     32,  33,  34,  35,  36,  37,  38,  39   xxxx   x.x        7  961
 01     1     40,  41,  42,  43,  44,  45,  46,  47   xxxx   x.x        7  961
 01     2     48,  49,  50,  51,  52,  53,  54,  55   xxxx   x.x        7  961
 01     3     56,  57,  58,  59,  60,  61,  62,  63   xxxx   x.x        7  961

To administer your vCPUs use psradm

To disable the vCPU with ID 1:

# psradm -f 1

To enable the vCPU with ID 0:

# psradm -n 0

The command also accepts multiple vCPU IDs.

To force a vCPU offline, use the -F option.

To get your system architecture run:

# isainfo

To check 64- or 32-bit capability:

# isainfo -b

Solaris port checker script

A good script for finding which process owns a port:

#!/bin/ksh
# Find the PID(s) holding a given port open
pids=$(/usr/bin/ps -ef -o pid=)
if [ $# -eq 0 ]; then
    read ans?"Enter port you would like to know pid for: "
else
    ans=$1
fi
for f in $pids
do
    /usr/proc/bin/pfiles $f 2>/dev/null | /usr/xpg4/bin/grep -q "port: $ans"
    if [ $? -eq 0 ]; then
        echo "Port: $ans is being used by PID:\c"
        pargs -l $f
        #/usr/bin/ps -o pid,args -p $f
    fi
done

Just enter the port you are interested in and the script prints the path of the program using it.
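The core of the script is simply grepping pfiles output for "port: <n>". A portable demo against a canned pfiles excerpt (the socket lines below are a fabricated sample, not from a live system):

```shell
#!/bin/sh
# Canned fragment of pfiles(1)-style output for a process with port 22 open
cat > /tmp/pfiles.sample <<'EOF'
   3: S_IFSOCK mode:0666 dev:326,0 ino:9211 uid:0 gid:0 size:0
      O_RDWR
        SOCK_STREAM
        sockname: AF_INET 0.0.0.0  port: 22
EOF

ans=22
# Same test the script performs for each PID
if grep -q "port: $ans" /tmp/pfiles.sample; then
    echo "Port: $ans is in use"
fi
```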

...and another script to show only listened ports on Solaris:

#!/usr/bin/env perl
## Search the processes which are listening on the given port.
## For SunOS 5.10.
use strict;
use warnings;
die "Port missing" unless $#ARGV >= 0;
my $port = int($ARGV[0]);
die "Invalid port" unless $port > 0;
my @pids;
map { push @pids, $_ if $_ > 0; } map { int($_) } `ls /proc`;
foreach my $pid (@pids) {
   open (PF, "pfiles $pid 2>/dev/null |") 
       || warn "Can not read pfiles $pid";
   $_ = <PF>;
   my $fd;
   my $type;
   my $sockname;
   my $peername;
   my $report = sub {
       if (defined $fd) {
           if (defined $sockname && ! defined $peername) {
               print "$pid $type $sockname\n"; } } };
   while (<PF>) {
       if (/^\s*(\d+):.*$/) {
           $fd = int ($1);
           undef $type;
           undef $sockname;
           undef $peername; }
       elsif (/(SOCK_DGRAM|SOCK_STREAM)/) { $type = $1; }
       elsif (/sockname: AF_INET[6]? (.*)  port: $port/) {
           $sockname = $1; }
       elsif (/peername: AF_INET/) { $peername = 1; } }
   close (PF); }

Check noexec_user_stack

# echo noexec_user_stack/D | mdb -k
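mdb prints the variable name and its decimal value; 1 means non-executable user stacks are enforced. A quick way to pick the value out of that output (the line below is a canned sample, not from a live kernel):

```shell
#!/bin/sh
# Canned sample of the mdb -k output format (value made up for illustration)
line='noexec_user_stack:              1'
# Take everything after the colon and strip the padding spaces
val=$(echo "$line" | awk -F: '{ gsub(/ /, "", $2); print $2 }')
if [ "$val" = "1" ]; then
    echo "noexec_user_stack enabled"
else
    echo "noexec_user_stack disabled"
fi
```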

Package installation

For additional packages on Solaris 8, 9 and 10 you can use the OpenCSW repository.

Patch Solaris 11 packages without internet connection

Sometimes you need to update some Solaris 11 packages using updates from Oracle (in p5p format), but you don't have an internet connection. Here is how to do it simply and quickly:

1) Create empty IPS repository:

# mkdir /tmp/ipsrepo
# pkgrepo create /tmp/ipsrepo

2) Import packages from .p5p file to our repository:

# pkgrecv -s ./idr1401.p5p -d /tmp/ipsrepo '*'

3) Set preferred repository:

# pkg set-publisher -P -g file:///tmp/ipsrepo solaris

or remove all existing origins (if you have the default Solaris 11 repo and don't use it) and set your new repo as the default:

# pkg set-publisher -G '*' -M '*' -g file:///tmp/ipsrepo solaris

or remove only a specific origin:

# pkg set-publisher -G file:///tmp/updates solaris

4) Install your patch:

# pkg install idr1401

Find out the Solaris 11 SRU level

Just issue this:

# pkg info entire | grep Summary | sed 's/.*[\(]\(.*\)[\)]./\1/'

Or examine pkg info entire output.
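To see what that sed expression does, here it is run against a canned pkg info Summary line (the SRU value shown is a made-up sample):

```shell
#!/bin/sh
# Sample "pkg info entire" Summary line (value is illustrative)
line='Summary: entire incorporation including Support Repository Update (Oracle Solaris 11.1.21.4.1).'
# Same extraction as above: grab the text inside the last parentheses
echo "$line" | sed 's/.*[\(]\(.*\)[\)]./\1/'
```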

List patches by install date

How to get patch install date on Solaris

showrev -p | sort -k2,2 | while read line; do
   PATCHID=$( echo "${line}" |\
              sed 's/^Patch: \([0-9]\{6\}-[0-9]\{2\}\) .*$/\1/' )
   # Only need to grab the first package name, even if the patch
   # has been installed to multiple packages - there will only be
   # a second or two difference between each package
   PACKAGE=$( echo "${line}" |\
              sed 's/^.*Packages: \([^,]*\).*$/\1/' )
   # The PATCH_INFO_* pkginfo parameters hold the install date
   INSTALLED=$( pkgparam "${PACKAGE}" "PATCH_INFO_${PATCHID}" |\
                sed 's/^Installed: \(.*\) From:.*$/\1/' )
   echo "${PATCHID}: ${INSTALLED}"
done
exit 0
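The two sed extractions in the script can be tried standalone against a canned showrev -p line (the patch ID and package names below are fabricated examples in the real output format):

```shell
#!/bin/sh
# A made-up "showrev -p" line in the usual format
line='Patch: 118833-36 Obsoletes: 117461-08 Requires:  Incompatibles:  Packages: SUNWcsu, SUNWcsr'
# Extract the patch ID (6 digits, dash, 2 digits)
PATCHID=$( echo "${line}" | sed 's/^Patch: \([0-9]\{6\}-[0-9]\{2\}\) .*$/\1/' )
# Extract the first package name after "Packages:"
PACKAGE=$( echo "${line}" | sed 's/^.*Packages: \([^,]*\).*$/\1/' )
echo "${PATCHID} ${PACKAGE}"
```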

Check print spooler service (Solaris 10)

# svcs -a | grep print
online Aug_13 svc:/application/print/cleanup:default
online 13:05:41 svc:/application/print/server:default
online 13:05:41 svc:/application/print/rfc1179:default
online 13:05:42 svc:/application/print/ipp-listener:default
# lpstat -r

Check free physical memory in Solaris

[root@host1 ~]# sar -r
SunOS host1 5.10 Generic_142900-13 sun4u    09/15/2010
00:00:08 freemem freeswap
00:15:06 19557199 257224020
00:30:07 19568546 259068172
00:45:06 19270095 258003409
[root@host1 ~]# pagesize

So the formula is:

Free memory = freemem pages × page size = 19557199 × 8 KB = 156457592 KB ≈ 152790 MB ≈ 149 GB

Don't apply this on Linux; sar reports memory differently there.
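The same arithmetic in shell, using the sample freemem value above and an 8 KB page (the value pagesize returns on this class of SPARC system):

```shell
#!/bin/sh
# freemem comes from sar -r (in pages), page size from pagesize(1) (in bytes)
freemem_pages=19557199
pagesize_bytes=8192

free_kb=$(( freemem_pages * (pagesize_bytes / 1024) ))
free_mb=$(( free_kb / 1024 ))
free_gb=$(( free_mb / 1024 ))
echo "${free_kb} KB = ${free_mb} MB = ${free_gb} GB"
```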


[root@host1 ~]# vmstat 2 10
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s0 s1 s2 s3   in   sy   cs us sy id
 2 20 0 154718432 190199784 1811 5261 1677 4992 4990 0 0 149 9 9 149 46293 278630 61146 4 6 90
 2 6 0 128356800 150948664 2617 5529 11383 0 0 0 0 179 0 0 179 55395 286775 67779 5 11 84

Free memory = (190199784 / 1024 / 1024) GB ≈ 181 GB (vmstat reports free in KB)

And to check how much physical memory you have in the system:

# prtconf -vp | grep -i mem
Memory size: 16384 Megabytes

And to see the memory bank and DIMM sizes:

# prtdiag -v
System Configuration:  Sun Microsystems  sun4u Sun Fire 480R
System clock frequency: 150 MHz
Memory size: 16384 Megabytes
========================= CPUs ===============================================
         Run   E$  CPU     CPU
Brd  CPU  MHz   MB  Impl.   Mask
---  ---  ----  ----  -------  ----
A     0  1200  8.0 US-III+  11.1
A     2  1200  8.0 US-III+  11.1
========================= Memory Configuration ===============================
         Logical  Logical  Logical
    MC   Bank     Bank     Bank         DIMM    Interleave  Interleaved
Brd  ID   num      size     Status       Size    Factor      with
---  ---  ----     ------   -----------  ------  ----------  -----------
A    0     0      2048MB   no_status    1024MB     8-way        0
A    0     1      2048MB   no_status    1024MB     8-way        0
A    0     2      2048MB   no_status    1024MB     8-way        0
A    0     3      2048MB   no_status    1024MB     8-way        0
A    2     0      2048MB   no_status    1024MB     8-way        0
A    2     1      2048MB   no_status    1024MB     8-way        0
A    2     2      2048MB   no_status    1024MB     8-way        0
A    2     3      2048MB   no_status    1024MB     8-way        0
========================= IO Cards =========================

Solaris health check (Solaris 10)

This handy script, taken from Solarispedia, can be used for weekly health checks on Solaris systems. I modified some lines for the memory and login checks.

#!/bin/sh
echo "Server name: `hostname`"
echo "Uptime: `uptime`"
pagesz=`pagesize`
memsz=`sar -r 1 1 | tail -1 | awk '{printf $2}'`
freemem=`echo "(($memsz*($pagesz/1024))/1024)" | bc`
echo "Free physical memory: $freemem MB"
# ntpdate -q may need an NTP server argument on your system
echo "Clock drift: `ntpdate -q`"
echo "Checking logs:"
/usr/xpg4/bin/grep -E 'error|critical|alert|warning' /var/adm/messages /var/adm/messages.0
echo "End checking logs"
outputOK () {
  echo $1 "[ OK ]"
}
outputW () {
  echo $1 "[ Warning ]"
}
# Check filesystems mounted
# For ZFS, just list mounted filesystems
for fs in `zfs mount | awk '{ print $2 }'`
do
 outputOK "Filesystem: $fs  mounted"
done
# Show status of all mounts listed in vfstab
for vol in `/usr/xpg4/bin/grep -E 'vxfs|ufs|nfs|cifs' /etc/vfstab | egrep -v '^#' | awk '{ print $3 }'`
do
 if df -k $vol | grep $vol > /dev/null
 then
  outputOK "Filesystem: $vol    mounted"
 else
  outputW  "Filesystem: $vol    NOT MOUNTED"
 fi
done
# Disksuite checks
if [ `metadb 2>&1 | grep 'no existing databases' | wc -l` != 1 ]
then
 if [ `metadb 2>&1 | grep -v flags | grep -v a | wc -l` != 0 ]
 then
  outputW "Disksuite: Inactive metadb(s) found"
 else
  outputOK "Disksuite: All metadbs are active"
 fi
else
 outputOK "Disksuite: Not in use"
fi
if [ `metastat 2>&1 | grep 'no existing databases' | wc -l` != 1 ]
then
 if [ `metastat 2>&1 | grep State: | egrep -v Okay | wc -l` != 0 ]
 then
  outputW "Disksuite: Metadevices are showing errors"
 else
  outputOK "Disksuite: All metadevices are in Okay state"
 fi
fi
# Check multipathing working if SAN attached.
if [ `fcinfo hba-port | wc -l` != 1 ]
then
 if [ `fcinfo hba-port | grep online | wc -l` != `fcinfo hba-port | grep State | wc -l` ]
 then
  outputW "SAN: Offline HBAs detected."
 else
  outputOK "SAN: All HBAs are online"
 fi
else
 outputOK "SAN: No HBAs found."
fi
# Check interfaces are plumbed, up and full duplex.
# Need to update to handle .bak, .orig, etc...
for iface in `ls /etc/hostname.* | awk -F. '{ print $2 }'`
do
 if dladm show-dev | grep $iface | grep up | grep full > /dev/null
 then
  outputOK "Network: $iface is up and full duplex"
 else
  outputW "Network: `dladm show-dev | grep $iface`"
 fi
done
# Check for a default route
if netstat -rn | grep default > /dev/null
then
 outputOK "Network: Default route found"
else
 outputW  "Network: Default route not found"
fi
# All services have started
if [ `svcs -xv | wc -l` != 0 ]
then
 outputW "Services: There are offline services"
else
 outputOK "Services: All services are online"
fi
if [ `inetadm | /usr/xpg4/bin/grep -vE 'online|disabled' | wc -l` != 1 ]
then
 outputW "Services: There are offline Inet services"
else
 outputOK "Services: All Inet services are online"
fi
# Check zones are online
if [ `zoneadm list -civ | wc -l` != 2 ]
then
 if [ `zoneadm list -civ | grep -v ID | grep -v template | wc -l` != `zoneadm list -civ | grep running | wc -l` ]
 then
  outputW "Zones: Not all local zones are running"
 else
  outputOK "Zones: All local zones are running"
 fi
else
 outputOK "Zones: No local zones detected"
fi
# Check size of mail queue
if [ `mailq | grep "requests: 0" | wc -l` != 1 ]
then
 outputW "Sendmail: mail queue greater than zero"
else
 outputOK "Sendmail: mail queue is empty"
fi
# Find locked user accounts.
for user in `grep '\*LK\*' /etc/shadow | /usr/xpg4/bin/grep -vE 'listen|gdm|webservd|nobody|noaccess|nobody4|uucp|lp|daemon|bin|nuucp' | awk -F: '{ print $1 }'`
do
 outputW "User: Account $user is locked"
done


Restart network service

# svcadm restart network/physical

or, using the abbreviated FMRI:

# svcadm restart physical

Create link aggregated interface


Prerequisites:

1. LACP supported by your router

2. GLDv3 supported by the devices

3. Full duplex mode and the same speed on both interfaces

4. OBP local-mac-address? set to true (# eeprom local-mac-address?=true)

Show network interfaces:

# dladm show-link

Unplumb network interfaces:

# ifconfig e1000g0 unplumb
# ifconfig e1000g1 unplumb

NOTE: back up or delete /etc/hostname.e1000g[0,1], otherwise you will get errors during system startup

Create aggregated interface:

# dladm create-aggr -d e1000g0 -d e1000g1 1

Where each "-d" specifies a device to add, and the trailing "1" is the aggregation key (the new interface will be named aggr1).

Plumb up interface:

# ifconfig aggr1 plumb up

Set persistent configuration:

# echo '' > /etc/hostname.aggr1

Perform a reconfiguration reboot (# reboot -- -r) if needed.

Set up IPMP redundancy

Link-based IPMP

Link-based IPMP is a type of redundancy provided by the in.mpathd daemon, which detects interface failure and brings the needed IP address up on another interface in the group.

To set up link-based IPMP, perform the following:

a) With active configuration

1. Plumb up primary interface

# ifconfig e1000g0 plumb netmask + broadcast + group ipmp0 up

2. Plumb up secondary interface

# ifconfig e1000g1 plumb group ipmp0 up

3. Set the same configuration persistently:

# cat /etc/hostname.e1000g0
netmask + broadcast + group ipmp0 up
# cat /etc/hostname.e1000g1
group ipmp0 up

b) With active-standby configuration

1. Plumb up primary interface

# ifconfig e1000g0 plumb netmask + broadcast + group ipmp0 up

2. Plumb up secondary interface

# ifconfig e1000g1 plumb group ipmp0 standby up

3. Set the same configuration persistently:

# cat /etc/hostname.e1000g0
netmask + broadcast + group ipmp0 up
# cat /etc/hostname.e1000g1
group ipmp0 standby up

c) Test your redundancy

To detach (-d) and re-attach (-r) an interface use:

# if_mpadm -d e1000g0
# if_mpadm -r e1000g0

Probe-based IPMP

Probe-based IPMP provides redundancy by sending ICMP probe packets via test addresses to target systems on the same subnet. There are two types of addresses: data addresses (used for actual data transfer) and test addresses (used to detect interface failure).

To set up probe-based IPMP, perform the following:

a) With active-active configuration

1. Allocate the addresses: one data address for the group plus one test address per interface

2. Plumb up primary interface

# ifconfig e1000g0 plumb netmask + broadcast + group ipmp0 up addif netmask + broadcast + deprecated -failover up

3. Plumb up secondary interface

# ifconfig e1000g1 plumb netmask + broadcast + deprecated -failover group ipmp0 up

4. Set the same configuration persistently:

# cat /etc/hostname.e1000g0
netmask + broadcast + group ipmp0 up addif netmask + broadcast + deprecated -failover up
# cat /etc/hostname.e1000g1
netmask + broadcast + deprecated -failover group ipmp0 up

b) With active-standby configuration

1. Allocate the addresses: one data address for the group plus one test address per interface

2. Plumb up primary interface

# ifconfig e1000g0 plumb netmask + broadcast + group ipmp0 up addif netmask + broadcast + deprecated -failover up

3. Plumb up secondary interface

# ifconfig e1000g1 plumb netmask + broadcast + deprecated -failover group ipmp0 standby up

4. Set the same configuration persistently:

# cat /etc/hostname.e1000g0
netmask + broadcast + group ipmp0 up addif netmask + broadcast + deprecated -failover up
# cat /etc/hostname.e1000g1
netmask + broadcast + deprecated -failover group ipmp0 standby up

c) Test your redundancy

To detach (-d) and re-attach (-r) an interface use:

# if_mpadm -d e1000g0
# if_mpadm -r e1000g0

You can see that the primary difference is just the standby flag in the secondary interface configuration. The other flags are:

deprecated - the address can be used as an IPMP test address but not for actual data transfer
-failover - the address does not fail over when the interface fails

Now check the ifconfig -a output to see interface states and flags. Remember, IPMP is not aggregation and will not increase your network bandwidth; it provides failover for your network address.




Copy partition table from one disk to another

Make sure your disks are the same size, then issue:

# prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t2d0s2


Create simple iSCSI non-zfs target (Solaris 10)

1. Select a disk and define the slice to use

2. Create target

# iscsitadm create target -b /dev/dsk/<disk-N>s<slice-N> <target name>

Add new SAN LUN (simple iSCSI)

1. Add a new discovery address if needed (leave the port blank to use the default 3260) and enable the sendtargets method:

# iscsiadm add discovery-address
# iscsiadm modify discovery --sendtargets enable

2. Then update your /dev namespace so the new device appears:

# devfsadm -i iscsi

3. See your new LUN

# echo | format

Add new SAN LUN (for EMC)

This procedure is for EMC storage, but others should work similarly.

1. Bind new LUN, assign it to storage group.

2. Rescan system for new devices:

[root@li-pdb /]# cfgadm -al
Ap_Id                          Type         Receptacle   Occupant     Condition
c0                             scsi-bus     connected    configured   unknown
c0::dsk/c0t0d0                 disk         connected    configured   unknown
c0::dsk/c0t1d0                 disk         connected    configured   unknown
c0::emcpsf0                    unknown      connected    configured   unknown
c1                             scsi-bus     connected    configured   unknown
c1::dsk/c1t0d0                 CD-ROM       connected    configured   unknown
c2                             fc-private   connected    configured   unknown
c2::5006016010210d72           disk         connected    configured   unknown
c3                             fc-private   connected    configured   unknown
c3::5006016810210d72           disk         connected    configured   unknown
usb0/1                         unknown      empty        unconfigured ok
usb1/1.1                       unknown      empty        unconfigured ok
usb1/1.2                       unknown      empty        unconfigured ok
usb1/1.3                       unknown      empty        unconfigured ok
usb1/1.4                       unknown      empty        unconfigured ok
usb1/2                         unknown      empty        unconfigured ok

2a. not sure if this is required

[root@li-pdb /]# devfsadm -c disk

3. Rescan emc devices

[root@li-pdb /]# powercf -q

Could not find config file entry for:
volume ID = 6006016062311200

adding emcpower14

[root@li-pdb /]# powermt config

4. Label the new disk; it should appear with a name like emcpower14a. Ignore any 'Error occurred with device in use checking: No such device' messages.

# format

Then select the newly added disk and run label.

5. Add disk under control of VXVM

[root@li-pdb /]# vxdctl enable
[root@li-pdb /]# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
c0t0d0s2     auto:none       -            -            online invalid
c0t1d0s2     auto:none       -            -            online invalid
emcpower0s2  auto:cdsdisk    emcdisk19    pdb_dg04     online
emcpower1s2  auto:sliced     emcdisk03    pdb_dg01     online
emcpower2s2  auto:sliced     emcdisk12    pdb_dg01     online
emcpower3s2  auto:sliced     emcdisk10    pdb_dg02     online
emcpower4s2  auto:sliced     emcdisk02    pdb_dg01     online
emcpower5s2  auto:cdsdisk    emcdisk11    pdb_dg02     online
emcpower6s2  auto:sliced     emcdisk05    pdb_dg01     online
emcpower7s2  auto:sliced     emcdisk11    pdb_dg03     online
emcpower8s2  auto:sliced     emcdisk08    pdb_dg02     online
emcpower9s2  auto:sliced     emcdisk07    pdb_dg02     online
emcpower10s2 auto:sliced     emcdisk04    pdb_dg01     online
emcpower11s2 auto:sliced     emcdisk06    pdb_dg02     online
emcpower12s2 auto:sliced     emcdisk09    pdb_dg02     online
emcpower13s2 auto:sliced     emcdisk01    pdb_dg01     online
emcpower14s2 auto:none       -            -            online invalid

6. The rest was done through the GUI (VEA):

Initialize the new disk, selecting sliced. Add it to the same group, move the volume to the new disk, and resize the volume to fill the whole disk.

7. Resize the filesystem

# /usr/lib/fs/vxfs/fsadm -F vxfs /plstai/applcsf

Solaris Volume Manager (Solstice Disksuite)

Update device relocation information

If you got errors in logs like:

Sep 20 05:17:43 storage metadevadm: [ID 209699 daemon.error] Invalid device relocation information detected in Solaris Volume Manager
Sep 20 05:17:43 storage metadevadm: [ID 912841 daemon.error] Please check the status of the following disk(s):
Sep 20 05:17:43 storage metadevadm: [ID 702911 daemon.error]    c1t2d0
Sep 20 05:17:43 storage metadevadm: [ID 702911 daemon.error]    c1t3d0
Sep 20 05:17:43 storage metadevadm: [ID 702911 daemon.error]    c1t4d0
Sep 20 05:17:43 storage metadevadm: [ID 702911 daemon.error]    c1t1d0

then you may need to update your device relocation information; invoke:

# metadevadm -u c1t1d0
Updating Solaris Volume Manager device relocation information for c1t1d0
Old device reloc information:
New device reloc information:
# metastat -i

Do not forget to run metastat -i to verify device functionality.

Cluster 3.3


Boot node in non-cluster mode


Enter boot -x at the boot prompt

For x86:

add -x at the end of $kernel command line

To boot in non-cluster mode after reboot issue:

# reboot -- -x

Solaris Cluster CLI Mapping

The original Oracle article describes the CLI commands and their equivalents between different versions of Solaris Cluster.

Disable Cluster install mode

To check whether your cluster is still in install mode, issue:

# cluster show -t global | grep installmode

To finish the installation of the cluster you can follow one of two ways:

1. Issue the clsetup command and follow the questions step by step to finish the cluster installation

2. Issue following to manually disable install mode:

# cluster set -p installmode=disabled

Then check your cluster mode again.

Cluster check

The cluster check utility reports the current cluster configuration parameters and their values, flagging settings that pose system risks or security problems. To check the current cluster installation use:

# cluster check -v

-v gives verbose output (lists all checks and their state)

To list all checks use:

# cluster list-checks

To list checks by keyword (effectively grouped by resource) use:

# cluster list-checks -K

To list checks for a specific keyword use:

# cluster list-checks -k <keyword>

To get details for a specific check:

# cluster list-checks -v -C <checkID>

Be careful with functional checks: they should be run after cluster configuration is complete but before any services run on the cluster. To list functional checks use:

# cluster list-checks -k functional

Functional checks can use maintenance commands like shutdown or restart to take actions during reboot or cluster formation. Be careful about running them on a live system.

Here you can see one of the functional checks, which shuts down the cluster:

# cluster list-checks -v -C F6984121
 F6984121: (Critical) Perform cluster shutdown.
Keywords: SolarisCluster3.x, functional
Applicability: Applicable if multi-node cluster running live.
Check Logic: Prompt for confirmation then execute '/usr/cluster/bin/scshutdown.' After all nodes have been halted and rebooted and the cluster has reformed, evaluate the post-reboot state.
Version: 1.4
Revision Date: 11/01/09
 cleaning up...

Then run the cluster functional check:

# cluster check -v -o /root/ -C F6984121

It will ask you several questions; then you decide whether to run it or not.

Shutdown cluster node

To properly shut down a cluster node you have to evacuate all running resource and device groups, including non-global zones, to another preferred node like this:

# clnode evacuate <node>

Then shutdown node:

# shutdown -g0 -y -i0

Create UFS clustered file system (global)

1. Define the shared disk to use (it can be a shared SCSI disk or even iSCSI)

2. Populate the disk into the cluster device namespace:

# cldevice populate

and check device state

# cldevice status   

=== Cluster DID Devices ===

Device Instance            Node                 Status
---------------            ----                 ------
/dev/did/rdsk/d1           s10-sc33-1           Ok
                           s10-sc33-2           Ok 

/dev/did/rdsk/d2           s10-sc33-1           Ok
                           s10-sc33-2           Ok

/dev/did/rdsk/d3           s10-sc33-1           Ok
                           s10-sc33-2           Ok

then see the device source (an iSCSI LUN in our example):

# cldevice list -v  
DID Device          Full Device Path
----------          ----------------
d1                  s10-sc33-1:/dev/rdsk/c0d0
d1                  s10-sc33-2:/dev/rdsk/c0d0
d2                  s10-sc33-1:/dev/rdsk/c1t0d0
d2                  s10-sc33-2:/dev/rdsk/c1t0d0
d3                  s10-sc33-1:/dev/rdsk/c2t600144F05410CB3600080027255D5600d0
d3                  s10-sc33-2:/dev/rdsk/c2t600144F05410CB3600080027255D5600d0

We are interested in device d3

3. Create new UFS filesystem

# newfs /dev/global/rdsk/d3s2

(we already know the disk slice since we partitioned this device on the storage; in other examples you can partition the disk before populating with cldevice)

4. Create a mount point and mount the device from vfstab (the pathname is just in case you decide to separate global filesystems from local ones by name; here I used the device group name and the device name itself)

# mkdir /global/group/d3
# vi /etc/vfstab

and put something like this

/dev/global/dsk/d3s2 /dev/global/rdsk/d3s2 /global/group/d3 ufs 2 yes global,logging

Options global and logging are necessary.

5. Check your vfstab with cluster utility

# cluster check -k vfstab

6. Mount your device from any cluster node

# mount /global/group/d3

7. See your SCSI/iSCSI device mounted on all nodes and available for read/write without corrupting data.

Cluster data services

Configure HA Apache web server (as failover service) for Solaris cluster

Here is a classic and simple scenario for configuring Apache HTTPD 2.4 for HA on Solaris Cluster, running the MediaWiki software from a globally mounted filesystem. The Apache web server in this example was compiled from source on Solaris 10.


By default Solaris 10 has no utility named make; it has gmake. So if you get an error that make, gcc, g++ or ar is not found, just follow this:

  • include /usr/sfw/bin and /usr/xpg4/bin in your $PATH
  • make a symlink /usr/bin/make pointing to /usr/sfw/bin/gmake

Other errors may come up; just read the script output and try to fix them.

1. Download httpd-2.4.10.tar.gz , pcre-8.35.tar.gz, apr-1.5.1.tar.gz , apr-util-1.5.3.tar.gz , php-5.6.0.tar.gz(or something similar)

2. Unpack httpd. Unpack apr and apr-util and move them into httpd/srclib/ (renaming the apr-1.5.1 and apr-util-1.5.3 folders to apr and apr-util accordingly). Unpack pcre-8.35.

Note: The global filesystem for Apache HA here is named /global/apache

3. Compile and install PCRE.

# cd /pcre-8.35
#./configure --enable-utf8 --prefix=/global/apache/pcre
# make
# make install

4. Compile and install Apache HTTPD

# ./configure --with-included-apr --enable-so --with-pcre=/global/apache/pcre --prefix=/global/apache/apache2 
# make
# make install

5. Compile and install PHP5

# ./configure --disable-cli --with-apxs2=/global/apache/apache2/bin/apxs --with-mysql --prefix=/global/apache/php5

Now you have a simple Apache web server with PHP5 support. It is suggested to install one copy of Apache to an HA filesystem like /global/apache/apache2 (available on both nodes at the same time, for example global UFS or NFS).

Also, to set up the Apache HA data service as a failover service for the cluster, you need the following resources:

  • logical hostname
  • a resource group, so the logical hostname and the Apache data service can be assigned to the same group

Setting up service

a) If you don't have a logical hostname, let's add one (mywebserver in this example).

1. Add a resource group:

# clrg create network-rg

2. Add logical hostname:

# clreslogicalhostname create -g network-rg -h mywebserver network-lh-rs

3. Bring up resource group:

# clresourcegroup manage network-rg
# clresourcegroup online network-rg

b) Create apache data service

1. Add Apache data service

# clresource create -g network-rg -t SUNW.apache -p Bin_dir=/global/apache/apache2/bin -p Resource_dependencies=network-lh-rs -p Port_list=80/tcp -p Scalable=false apache-ds-rs

Options are:

  • -t - the resource type, SUNW.apache in our example (must be registered before adding the resource)
  • -g - the resource group the Apache data service will be assigned to (the logical hostname should be assigned to the same group)
  • Bin_dir - the apache/bin directory containing the apachectl utility
  • Resource_dependencies - the network resources (logical hostnames) the Apache service will use
  • Port_list - list of ports in the format $port/$protocol, for example 80/tcp,81/tcp
  • Scalable - used for a scalable Apache service, not applicable for failover

c) Check your service status

# clrs status

=== Cluster Resources === 

Resource Name      Node Name       State        Status Message
-------------      ---------       -----        --------------
apache-ds-rs       s10-sc33-2      Offline      Offline - Successfully stopped Apache Web Server.
                   s10-sc33-1      Online       Online - Service is online. 

network-lh-rs      s10-sc33-2      Offline      Offline - LogicalHostname offline.
                   s10-sc33-1      Online       Online - LogicalHostname online.

Apache HA as failover service for Cluster installation completed.

Now let's configure Apache to run MediaWiki. Since some syntax changed in httpd.conf with Apache 2.4, you may need the following minimum in your httpd.conf to successfully run .php scripts:

Server root directory:

ServerRoot "/global/apache/apache2"
Listen 80

List of loaded modules:

LoadModule unixd_module modules/
LoadModule dir_module modules/
LoadModule authz_host_module modules/
LoadModule authz_core_module    modules/
LoadModule php5_module        modules/
LoadModule mime_module modules/
LoadModule alias_module modules/

Document root:

<Directory "/global/apache/apache2/htdocs/">
    Options Indexes FollowSymLinks
    AllowOverride None
    Require all granted
</Directory>
<Directory "/global/apache/apache2/htdocs/wiki/">
    Options Indexes FollowSymLinks
    AllowOverride None
    Require all granted
</Directory>

Your start page:

<IfModule dir_module>
    DirectoryIndex wiki/index.php
</IfModule>

File extension prevention (last 2 blocks not necessary, just as example):

<FilesMatch "^\.htaccess">
    Require all denied
</FilesMatch>
<FilesMatch "^\.html">
    Require all granted
</FilesMatch>
<FilesMatch "^\.php">
    Require all granted
</FilesMatch>

Aliases (you may need to set it as here if you use CGI):

<IfModule alias_module>
   ScriptAlias /cgi-bin/ "/global/apache/php5/bin/"
</IfModule>
<Directory "/global/apache/php5/bin">
    AllowOverride None
    Options None
    Require all granted
</Directory>

Application types and handlers (necessary block to run PHP5 scripts):

<IfModule mime_module>
    TypesConfig conf/mime.types
    AddType application/x-compress .Z
    AddType application/x-gzip .gz .tgz
    AddType application/x-httpd-php .php

    AddHandler cgi-script .cgi
    AddHandler application/x-httpd-php .php
</IfModule>

This is just an example; most of these directives are already in httpd.conf by default, so use it to verify things if you have PHP problems. Before running Apache, untar your MediaWiki archive to /global/apache/apache2/htdocs/wiki (index.php should end up directly under the wiki folder).

d) Run your service and configure wiki:

# clrs enable apache-ds-rs

Open your browser and point it to:

http://<your_network-lh-rs IP>/wiki/mw-config/index.php

It will generate your LocalSettings.php. Then log on to your wiki:

http://<your_network-lh-rs IP>

Note: There is still no example of using failover MySQL with this wiki; since SQLite is enough to run MediaWiki (with db files on the global filesystem), we will not review that case here.

Cluster Resources

Create HA logical hostname

This topic covers the creation procedure for an HA logical hostname. It means the following:

- having a failover IP address on one cluster node that will be moved to another node if that node fails
- having a resource for this IP address
- the resource is included in a resource group that has already been created
- you should have IPMP redundancy on each node to allow failover without switching between nodes (not necessary but recommended)

Follow these steps:

1. Ensure that your logical hostname resolves on all cluster nodes that share this resource (sc-demo in this example):

# getent hosts sc-demo
sc-demo

It is better if the hostname resolves across your cluster network (via DNS or another name service).

2. Create the resource group if it has not already been created:

# clresourcegroup create network-rg

3. Create resource and assign it to appropriate group:

# clreslogicalhostname create -g network-rg -h sc-demo network-lh-rs

4. Bring this resource group online on each node that shares the resource:

# clresourcegroup manage network-rg
# clresourcegroup online network-rg

Now you can see the aliased IP address on your public network devices:

# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet netmask ff000000 
e1000g0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet netmask ffffff00 broadcast
        groupname sc_ipmp0
        ether 8:0:27:92:13:9b 
e1000g0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet netmask ff000000 broadcast
e1000g0:2: flags=1001040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,FIXEDMTU> mtu 1500 index 2
        inet netmask ffffff00 broadcast
        inet netmask 0 
        groupname sc_ipmp0
        ether 8:0:27:bd:21:7 
e1000g2: flags=1008843<UP,BROADCAST,RUNNING,MULTICAST,PRIVATE,IPv4> mtu 1500 index 5
        inet netmask ffffffc0 broadcast
        ether 8:0:27:6a:15:13 
e1000g3: flags=1008843<UP,BROADCAST,RUNNING,MULTICAST,PRIVATE,IPv4> mtu 1500 index 4
        inet netmask ffffffc0 broadcast
        ether 8:0:27:cd:d9:66 
clprivnet0: flags=1008843<UP,BROADCAST,RUNNING,MULTICAST,PRIVATE,IPv4> mtu 1500 index 6
        inet netmask ffffff00 broadcast
        ether 0:0:0:0:0:1

Common problems

Cluster check fails

You may see the following message during your first cluster check:

# cluster check 
cacaocsc: unable to connect: Network address is presently unreachable 
cluster check: (C704199) unable to reach Common Agent Container

There are three common reasons for this: missing forward/reverse address lookups, a security key issue, or rpcbind not being configured. First, try to look up your node addresses:

# nslookup s10-sc33-1

Name:   s10-sc33-1

# nslookup
Address:     name = s10-sc33-1.

Second, try to recreate your security keys as suggested by Oracle.

Third, ensure you have rpcbind configured for Solaris Cluster.

Recover cluster after losing quorum device

Sometimes your quorum device may lose connectivity with the cluster during maintenance or because of failing hardware. The quorum device may then remain offline and you get the error:

[ID 832830 kern.warning] WARNING: CMM: Open failed for quorum device
[ID 980942 kern.notice] NOTICE: CMM: Cluster doesn't have operational quorum yet; waiting for quorum.

Issuing

# clq enable d<N>

or

# clq remove d<N>

will fail. What next? Try to recover your quorum device with the following steps:

1. Boot nodes in non-cluster mode:

2. Backup Cluster Infrastructure table:

# cd /etc/cluster/ccr/global
# cp infrastructure infrastructure.old

3. Remove all cluster.quorum_devices-related entries from the infrastructure file and check:

# grep cluster.quorum_devices infrastructure

4. Recover infrastructure table

# /usr/cluster/lib/sc/ccradm recover -o infrastructure

5. Reboot the nodes.

6. Check node status on both nodes:

# clnode status

7. Check cluster device status:

# cldevice status

8. Refresh and populate your device (or new attached device) again:

# cldevice refresh -n <node>
# cldevice populate

Check device status again and ensure it's online now

9. Check quorum devices list:

# clq list

10. Re-attach your quorum device:

# clq add d<N>
# clq status

That should be enough to bring your quorum back into an operational state.

Resource group is undergoing reconfiguration

There is a big set of common causes for this problem, but often it is provoked by a misbehaving process that keeps restarting. To take this resource group offline use:

# clrg quiesce -k resource_group

The -k switch kills the misbehaving process (like kill -9).

Use it only as a last resort, and only if you can restart the affected services afterwards.