2015-06-18

Hey, folks. This is a guest post by Doug Schwabauer about using MPxIO and Veritas in Ops Center. It's well worth the read.

Enterprise Manager Ops Center has a rich feature set for Oracle VM Server for SPARC (LDOMs) technology. You can provision control domains, service domains, and LDOM guests, control the size and properties of the guests, add networks and storage to guests, create server pools, and so on. The foundation for many of these features is the Storage Library. In Ops Center, a Storage Library can be Fibre Channel based, iSCSI based, or NAS based.

For FC- and iSCSI-based LUNs, Ops Center only recognizes LUNs that are presented to the Solaris host via MPxIO, the built-in multipathing software for Solaris. LUNs that are presented via other multipathing software, such as Symantec (Veritas) DMP or EMC PowerPath, are not recognized.

Sometimes users do not have the option to use MPxIO, or choose not to use it for other reasons, such as use cases where SCSI3 Persistent Reservations (SCSI3 PR) are required.

It is possible to have a mix of MPxIO-managed LUNs and other LUNs, and to mix and match the LDOM guests and storage libraries for different purposes.

Below are two such use cases. In both, the user wanted to utilize Veritas Cluster Server (VCS) to cluster select applications running within LDOMs that are managed with Ops Center. However, the cluster requires I/O fencing via SCSI3 PR. In this scenario, storage devices CANNOT be presented via MPxIO, since SCSI3 PR requires direct access to the storage from inside the guest OS, and MPxIO does not provide that capability. Therefore, the user thought they would have to choose between not using Ops Center and not using VCS.

We found and presented a "middle road" solution, where the user is
able to do both - use Ops Center for the majority of their LDOM/OVM
Server for SPARC environment, but still use Veritas Dynamic
Multi-Pathing (DMP) software to manage the disk devices used for the
data protected by the cluster.

In both use cases, the hardware is the same:

Two SPARC T4-2 servers, each with 4 FC cards and 2 ports per card

Sun StorEdge 6180 FC LUNs presented to both hosts

One primary service domain and one alternate service domain (a root complex domain) on each host

Each service domain sees two of the four FC cards.

See the following blog posts for more details on setting up Alternate Service Domains:

https://blogs.oracle.com/jsavit/entry/availability_best_practices_availability_using

https://blogs.oracle.com/jsavit/entry/availability_best_practices_example_configuring

In our environment, the primary domain owns the cards in Slots 4 and 6, and the alternate domain owns the cards in Slots 1 and 9.

(Refer to the server's Service Manual for system schematics and bus/PCI layouts.)
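
To confirm which PCIe buses, slots, and cards are assigned to the primary and alternate domains, you can run ldm list-io from the control domain (shown here as the bare command; the output columns vary by platform and Logical Domains Manager version, and ldm list-io -l adds per-device detail):

root@example:~# ldm list-io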

A user can control which specific cards, and even which ports on a card, use MPxIO and which do not.

You can either enable MPxIO globally and then disable it on certain ports, or disable MPxIO globally and then enable it on certain ports. Either approach accomplishes the same thing.

See the Enabling or Disabling Multipathing on a Per-Port Basis document for more information.

For example:

root@example:~# tail /etc/driver/drv/fp.conf
# "target-port-wwn,lun-list"
#
# To prevent LUNs 1 and 2 from being configured for target
# port 510000f010fd92a1 and target port 510000e012079df1, set:
#
# pwwn-lun-blacklist=
# "510000f010fd92a1,1,2",
# "510000e012079df1,1,2";
mpxio-disable="no";      <---------------------- Enable MPxIO globally
name="fp" parent="/pci@400/pci@2/pci@0/pci@0/SUNW,qlc@0" port=0 mpxio-disable="yes";  <--- Disable on port

root@example:~# ls -l /dev/cfg
total 21
.
.
.
lrwxrwxrwx   1 root     root          60 Feb 13 12:51 c3 -> ../../devices/pci@400/pci@1/pci@0/pci@8/SUNW,qlc@0/fp@0,0:fc
lrwxrwxrwx   1 root     root          62 Feb 13 12:51 c4 -> ../../devices/pci@400/pci@1/pci@0/pci@8/SUNW,qlc@0,1/fp@0,0:fc
lrwxrwxrwx   1 root     root          60 Feb 13 12:51 c5 -> ../../devices/pci@400/pci@2/pci@0/pci@0/SUNW,qlc@0/fp@0,0:fc
lrwxrwxrwx   1 root     root          62 Feb 13 12:51 c6 -> ../../devices/pci@400/pci@2/pci@0/pci@0/SUNW,qlc@0,1/fp@0,0:fc
.
.
.

Therefore "c5" on the example host will not be using MPxIO.

Similar changes were made for the other 3 service domains.
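
To verify which LUNs are actually under MPxIO control on a given service domain, one hedged check is mpathadm, which lists only the logical units managed by the scsi_vhci (MPxIO) driver; LUNs seen through the excluded c5 port should not appear in this output:

root@example:~# mpathadm list lu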

Now, for the guest vdisks that will not use MPxIO, the backend devices used were just raw /dev/dsk names - no multipathing software is involved. You will see a mix of both kinds in the virtual disk service listings below: the aa-guest2-vol0 and aa-guest3-vol0 backends (the long c0t<WWN>d0 device names) use MPxIO, while the quorum and clusterdata backends (the c5.../c3... device names) do not. The ldm commands used to create entries like these are sketched after the listings:

VDS
NAME             LDOM             VOLUME         OPTIONS          MPGROUP        DEVICE
primary-vds0     primary          aa-guest2-vol0                  aa-guest2      /dev/rdsk/c0t60080E5000183F120000107754E60374d0s2
                                  quorum1                                        /dev/dsk/c5t20140080E5184632d12s2
                                  quorum2                                        /dev/dsk/c5t20140080E5184632d13s2
                                  quorum3                                        /dev/dsk/c5t20140080E5184632d14s2
                                  clusterdata1                                   /dev/dsk/c5t20140080E5184632d8s2
                                  clusterdata2                                   /dev/dsk/c5t20140080E5184632d7s2
                                  clusterdata3                                   /dev/dsk/c5t20140080E5184632d10s2
                                  aa-guest3-vol0                  aa-guest3      /dev/dsk/c0t60080E5000183F120000138B5522A1C4d0s2

VDS
NAME             LDOM             VOLUME         OPTIONS          MPGROUP        DEVICE
alternate-vds0   example-a        aa-guest2-vol0                  aa-guest2      /dev/rdsk/c0t60080E5000183F120000107754E60374d0s2
                                  clusterdata3                                   /dev/dsk/c3t20140080E5184632d10s2
                                  clusterdata2                                   /dev/dsk/c3t20140080E5184632d7s2
                                  clusterdata1                                   /dev/dsk/c3t20140080E5184632d8s2
                                  quorum3                                        /dev/dsk/c3t20140080E5184632d14s2
                                  quorum2                                        /dev/dsk/c3t20140080E5184632d13s2
                                  quorum1                                        /dev/dsk/c3t20140080E5184632d12s2
                                  aa-guest3-vol0                  aa-guest3      /dev/rdsk/c0t60080E5000183F120000138B5522A1C4d0s2
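
For reference, entries like the ones above are created with ldm add-vdsdev and ldm add-vdisk from the control domain. Below is a hedged sketch for the quorum1 backend; the vdisk names (quorum1 and quorum1-alt) and the target guest (aa-guest1) are illustrative assumptions, not taken from the listings:

root@example:~# ldm add-vdsdev /dev/dsk/c5t20140080E5184632d12s2 quorum1@primary-vds0
root@example:~# ldm add-vdsdev /dev/dsk/c3t20140080E5184632d12s2 quorum1@alternate-vds0
root@example:~# ldm add-vdisk quorum1 quorum1@primary-vds0 aa-guest1
root@example:~# ldm add-vdisk quorum1-alt quorum1@alternate-vds0 aa-guest1

The two add-vdisk commands are what give the guest two separate devices (one per service domain) for the same LUN, which DMP in the guest later coalesces.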

Here you can see in Ops Center what the Alternate Domain's virtual disk services look like:



From the guest LDOM's perspective, we see 12 data disks (c1d0 is the boot disk), which are really 2 paths to 6 LUNs - one path from the primary domain and one from the alternate:

AVAILABLE DISK SELECTIONS:
0. c1d0 <SUN-SUN_6180-0784-100.00GB>
/virtual-devices@100/channel-devices@200/disk@0
1. c1d1 <SUN-SUN_6180-0784-500.00MB>
/virtual-devices@100/channel-devices@200/disk@1
2. c1d2 <SUN-SUN_6180-0784-500.00MB>
/virtual-devices@100/channel-devices@200/disk@2
3. c1d3 <SUN-SUN_6180-0784-500.00MB>
/virtual-devices@100/channel-devices@200/disk@3
4. c1d4 <SUN-SUN_6180-0784-500.00MB>
/virtual-devices@100/channel-devices@200/disk@4
5. c1d5 <SUN-SUN_6180-0784-500.00MB>
/virtual-devices@100/channel-devices@200/disk@5
6. c1d6 <SUN-SUN_6180-0784 cyl 51198 alt 2 hd 64 sec 64>
/virtual-devices@100/channel-devices@200/disk@6
7. c1d7 <SUN-SUN_6180-0784 cyl 51198 alt 2 hd 64 sec 64>
/virtual-devices@100/channel-devices@200/disk@7
8. c1d8 <SUN-SUN_6180-0784 cyl 25598 alt 2 hd 64 sec 64>
/virtual-devices@100/channel-devices@200/disk@8
9. c1d9 <SUN-SUN_6180-0784 cyl 25598 alt 2 hd 64 sec 64>
/virtual-devices@100/channel-devices@200/disk@9
10. c1d10 <SUN-SUN_6180-0784 cyl 25598 alt 2 hd 64 sec 64>
/virtual-devices@100/channel-devices@200/disk@a
11. c1d11 <SUN-SUN_6180-0784 cyl 25598 alt 2 hd 64 sec 64>
/virtual-devices@100/channel-devices@200/disk@b
12. c1d12 <SUN-SUN_6180-0784-500.00MB>
/virtual-devices@100/channel-devices@200/disk@c
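
To map the guest's c1dN devices back to vdisk and vdsdev names, one hedged approach is to compare the disk@N unit numbers above with the vdisk IDs reported by ldm from the control domain (the guest name aa-guest1 is assumed here):

root@example:~# ldm list -o disk aa-guest1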

Again from Ops Center, you can click on the Storage tab of the guest, and see that the MPxIO-enabled LUN is known to be "Shared" by the hosts in Ops Center, while the other LUNs are not:



At this point, since VCS was going to be installed on the LDOM OS and a cluster built, the Veritas stack, including VxVM and VxDMP, was enabled on the guest LDOMs to coalesce the two paths from the primary and alternate domains into a single device.

For example:

root@aa-guest1:~# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
sun6180-0_0  auto:ZFS        -            -            ZFS
sun6180-0_1  auto:cdsdisk    sun6180-0_1  data_dg      online shared
sun6180-0_2  auto:cdsdisk    -            -            online
sun6180-0_3  auto:cdsdisk    -            -            online
sun6180-0_4  auto:cdsdisk    -            -            online
sun6180-0_5  auto:cdsdisk    sun6180-0_5  data_dg      online shared
sun6180-0_6  auto:cdsdisk    sun6180-0_6  data_dg      online shared

root@aa-guest1:~# vxdisk list sun6180-0_6
Device:    sun6180-0_6
devicetag: sun6180-0_6
type:      auto
clusterid: aa-guest
disk:      name=sun6180-0_6 id=1427384525.22.aa-guest1
group:     name=data_dg id=1427489834.14.aa-guest1
info:      format=cdsdisk,privoffset=256,pubslice=2,privslice=2
flags:     online ready private autoconfig shared autoimport imported
pubpaths:  block=/dev/vx/dmp/sun6180-0_6s2 char=/dev/vx/rdmp/sun6180-0_6s2
guid:      {a72e068c-d3ce-11e4-b9a0-00144ffe28bc}
udid:      SUN%5FSUN%5F6180%5F60080E5000184108000000004C2CF217%5F60080E5000184632000056B354E6025F
site:      -
version:   3.1
iosize:    min=512 (bytes) max=2048 (blocks)
public:    slice=2 offset=65792 len=104783616 disk_offset=0
private:   slice=2 offset=256 len=65536 disk_offset=0
update:    time=1430930621 seqno=0.15
ssb:       actual_seqno=0.0
headers:   0 240
configs:   count=1 len=48144
logs:      count=1 len=7296
Defined regions:
config   priv 000048-000239[000192]: copy=01 offset=000000 enabled
config   priv 000256-048207[047952]: copy=01 offset=000192 enabled
log      priv 048208-055503[007296]: copy=01 offset=000000 enabled
lockrgn  priv 055504-055647[000144]: part=00 offset=000000
Multipathing information:
numpaths:   2
c1d11s2         state=enabled   type=secondary
c1d9s2          state=enabled   type=secondary
connectivity: aa-guest1 aa-guest2

root@aa-guest1:~# vxdmpadm getsubpaths dmpnodename=sun6180-0_6
NAME         STATE[A]   PATH-TYPE[M] CTLR-NAME          ENCLR-TYPE   ENCLR-NAME    ATTRS
========================================================================================
c1d11s2      ENABLED(A)  SECONDARY    c1                 SUN6180-     sun6180-0        -
c1d9s2       ENABLED(A)  SECONDARY    c1                 SUN6180-     sun6180-0        -
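
If more vdisks are added to a guest after the Veritas stack is installed, a hedged way to make them visible to VxVM and DMP (exact steps can vary with the Solaris and VxVM versions) is to rediscover devices from inside the guest:

root@aa-guest1:~# devfsadm -c disk      # build device nodes for newly added virtual disks
root@aa-guest1:~# vxdctl enable         # have VxVM rescan devices and refresh the DMP database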

In this way, the two guests to be clustered together are now ready for VCS installation and configuration.

The second use case changes a little bit in that both MPxIO and Veritas DMP are used in the primary and alternate domains, and DMP is still used in the guest as well. The advantage of this is that there is more redundancy and I/O throughput available at the service domain level, because multipathed devices are used for the guest virtual disk services instead of just raw /dev/dsk/c#t#d# devices.

Now the disk services look something like this. The aa-guest2-vol0 and aa-guest3-vol0 backends remain MPxIO-based, while the quorum and clusterdata backends are now DMP-based (/dev/vx/dmp paths):

VDS
NAME             LDOM             VOLUME         OPTIONS          MPGROUP        DEVICE
primary-vds0     primary          aa-guest2-vol0                  aa-guest2      /dev/rdsk/c0t60080E5000183F120000107754E60374d0s2
                                  aa-guest3-vol0                  aa-guest3      /dev/dsk/c0t60080E5000183F120000138B5522A1C4d0s2
                                  quorum1                                        /dev/vx/dmp/sun6180-0_6s2
                                  quorum2                                        /dev/vx/dmp/sun6180-0_7s2
                                  quorum3                                        /dev/vx/dmp/sun6180-0_8s2
                                  clusterdata1                                   /dev/vx/dmp/sun6180-0_12s2
                                  clusterdata2                                   /dev/vx/dmp/sun6180-0_5s2
                                  clusterdata3                                   /dev/vx/dmp/sun6180-0_14s2

VDS
NAME             LDOM             VOLUME         OPTIONS          MPGROUP        DEVICE
alternate-vds0   example-a        aa-guest2-vol0                  aa-guest2      /dev/rdsk/c0t60080E5000183F120000107754E60374d0s2
                                  aa-guest3-vol0                  aa-guest3      /dev/rdsk/c0t60080E5000183F120000138B5522A1C4d0s2
                                  quorum1                                        /dev/vx/dmp/sun6180-0_6s2
                                  quorum2                                        /dev/vx/dmp/sun6180-0_7s2
                                  quorum3                                        /dev/vx/dmp/sun6180-0_8s2
                                  clusterdata1                                   /dev/vx/dmp/sun6180-0_12s2
                                  clusterdata2                                   /dev/vx/dmp/sun6180-0_5s2
                                  clusterdata3                                   /dev/vx/dmp/sun6180-0_14s2
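
Switching a backend from a raw /dev/dsk path to a DMP path amounts to recreating the vdsdev with a /dev/vx/dmp device. A hedged sketch for clusterdata2 on the primary service domain, assuming the corresponding vdisk is not in active use while the change is made (the same change would be repeated on alternate-vds0 with its own DMP device name):

root@example:~# ldm rm-vdsdev clusterdata2@primary-vds0
root@example:~# ldm add-vdsdev /dev/vx/dmp/sun6180-0_5s2 clusterdata2@primary-vds0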

Again, the advantage here is that each service domain now has two paths to the same LUN behind each virtual disk backend, so there is additional redundancy and throughput available. You can see the two paths from the service domain:

root@example:~# vxdisk list sun6180-0_5
Device:    sun6180-0_5
devicetag: sun6180-0_5
type:      auto
clusterid: aa-guest
disk:      name= id=1427384268.11.aa-guest1
group:     name=data_dg id=1427489834.14.aa-guest1
info:      format=cdsdisk,privoffset=256,pubslice=2,privslice=2
flags:     online ready private autoconfig shared autoimport
pubpaths:  block=/dev/vx/dmp/sun6180-0_5s2 char=/dev/vx/rdmp/sun6180-0_5s2
guid:      {0e62396e-d3ce-11e4-b9a0-00144ffe28bc}
udid:      SUN%5FSUN%5F6180%5F60080E5000184108000000004C2CF217%5F60080E5000183F120000107F54E603CA
site:      -
version:   3.1
iosize:    min=512 (bytes) max=2048 (blocks)
public:    slice=2 offset=65792 len=104783616 disk_offset=0
private:   slice=2 offset=256 len=65536 disk_offset=0
update:    time=1430930621 seqno=0.15
ssb:       actual_seqno=0.0
headers:   0 240
configs:   count=1 len=48144
logs:      count=1 len=7296
Defined regions:
config   priv 000048-000239[000192]: copy=01 offset=000000 enabled
config   priv 000256-048207[047952]: copy=01 offset=000192 enabled
log      priv 048208-055503[007296]: copy=01 offset=000000 enabled
lockrgn  priv 055504-055647[000144]: part=00 offset=000000
Multipathing information:
numpaths:   2
c5t20140080E5184632d7s2 state=enabled   type=primary
c5t20250080E5184632d7s2 state=enabled   type=secondary

The Ops Center view of the virtual disk services is much the same:



Now the cluster can be set up just as it was before. To the guest, the virtual disks have not changed; only the back-end presentation of the LUNs has changed, and that change is transparent to the guest.
