2014-12-11

This article is intended to help a new or experienced Oracle Solaris user to quickly and easily install and configure Oracle Solaris Cluster software for two nodes, including the creation of Single Root I/O Virtualization/InfiniBand (SR-IOV/IB) devices. It provides a step-by-step procedure to simplify the process.

This article does not cover the configuration of highly available services. For more details on how to install and configure other Oracle Solaris Cluster software configurations, see the Oracle Solaris Cluster Software Installation Guide.

This article uses the interactive scinstall utility to configure all the nodes of a cluster quickly and easily. The interactive scinstall utility is menu driven. The menus help reduce the chance of mistakes and promote best practices by using default values and prompting you for information specific to your cluster. The utility also helps prevent mistakes by identifying invalid entries. Finally, the scinstall utility eliminates the need to manually set up a quorum device by automating the configuration of a quorum device for your new cluster.

Note: This article applies to the Oracle Solaris Cluster 4.1 release. For more information about the Oracle Solaris Cluster release, see the Oracle Solaris Cluster 4.1 Release Notes.

Overview of SR-IOV

SR-IOV is a PCI-SIG standards-based I/O virtualization specification. SR-IOV enables a PCIe function, known as the physical function (PF), to create multiple lightweight PCIe functions, known as virtual functions (VFs). VFs appear and operate like regular PCIe functions. The address space of a VF is well contained, so a VF can be assigned to a virtual machine (a logical domain, or LDom) with the help of the hypervisor. SR-IOV provides a higher degree of sharing than the other direct hardware access methods available in LDoms technology, namely PCIe bus assignment and direct I/O.

Prerequisites, Assumptions, and Defaults

This section discusses several prerequisites, assumptions, and defaults for two-node clusters.

Configuration Assumptions

This article assumes the following configuration is used:

You are installing the two-node cluster on Oracle Solaris 11.1 and you have basic system administration skills.

You are installing the Oracle Solaris Cluster 4.1 software.

The cluster hardware is a supported configuration for Oracle Solaris Cluster 4.1 software.

This is a two-node cluster of SPARC T4-4 servers from Oracle. SR-IOV is supported only on servers based on Oracle's SPARC T4 (or later) processors.

Each cluster node is an I/O domain.

Each node has two spare network interfaces to be used as private interconnects, also known as transports, and at least one network interface that is connected to the public network.

iSCSI shared storage is connected to the two nodes.

Your setup looks like Figure 1. You might have fewer or more devices, depending on your system or network configuration.

In addition, it is recommended that you have console access to the nodes during cluster installation, but this is not required.


Figure 1. Oracle Solaris Cluster Hardware Configuration

Prerequisites

Perform the following prerequisite tasks:

Ensure that Oracle Solaris 11.1 SRU13 is installed on both the SPARC T4-4 systems.

Perform the initial preparation of public IP addresses and logical host names.

You must have the logical names (host names) and IP addresses of the nodes to configure a cluster. Add those entries to each node's /etc/inet/hosts file or to a naming service if such a service (for example, DNS, NIS, or NIS+ maps) is used. The example in this article uses a NIS service.
Table 1 lists the configuration used in this example.
Table 1. Configuration

Component    Name             Interface    IP Address
cluster      phys-schost      -            -
node 1       phys-schost-1    igbvf0       1.2.3.4
node 2       phys-schost-2    igbvf0       1.2.3.5

Create SR-IOV VF devices for the public, private, and storage networks.

You have to create the VF devices on the corresponding adapters for public, private, and storage networks in the primary domain and assign the VF devices to the logical domains that will be configured as cluster nodes.
Type the commands shown in Listing 1 on the control domain phys-primary-1:

Listing 1
The VF IOVNET.PF0.VF1 is used for the public network. The IB VFs host IB partitions that carry both the private network and the storage network.
Repeat the commands shown in Listing 1 on phys-primary-2. The I/O domain domain1 on both nodes must be installed with Oracle Solaris 11.1 SRU13 before installing the cluster software.
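As a rough illustration of the kind of commands this step involves (the VF name IOVNET.PF0.VF1 and the domain name domain1 come from the example above, while the IB PF name and VF counts are assumptions; check the actual PF names on your system with ldm list-io):

    # List the PFs that are available in the primary domain.
    ldm list-io

    # Create VFs on the Ethernet PF (public network) and on the IB PF
    # (private interconnect and storage partitions); the PF names are examples.
    ldm create-vf -n 2 IOVNET.PF0
    ldm create-vf -n 2 IOVIB.PF0

    # Assign the VFs to the I/O domain that will become a cluster node.
    ldm add-io IOVNET.PF0.VF1 domain1
    ldm add-io IOVIB.PF0.VF0 domain1
    ldm add-io IOVIB.PF0.VF1 domain1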

Note: To learn more about SR-IOV technology, take a look at the documentation for Oracle VM Server for SPARC 3.1. For information about InfiniBand VFs, see "Using InfiniBand SR-IOV Virtual Functions."

Defaults

The scinstall interactive utility in the Typical mode installs the Oracle Solaris Cluster software with the following defaults:

Private-network address: 172.16.0.0

Private-network netmask: 255.255.248.0

Cluster-transport switches: switch1 and switch2

Perform the Preinstallation Checks

Temporarily enable rsh or ssh access for root on the cluster nodes.
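For example, on each node you can temporarily allow ssh logins as root (a sketch; revert these settings after the cluster is configured):

    # Make root a normal account rather than a role.
    rolemod -K type=normal root

    # Set "PermitRootLogin yes" in /etc/ssh/sshd_config, then restart the ssh service.
    svcadm restart svc:/network/ssh:default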

Log in to the cluster node on which you are installing Oracle Solaris Cluster software and become superuser.

On each node, verify the /etc/inet/hosts file entries. If no other name resolution service is available, add the name and IP address of the other node to this file.

The /etc/inet/hosts file on node 1 has the following information.

The /etc/inet/hosts file on node 2 has the following information.
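On both nodes, the relevant entries look like the following sketch, which uses the names and addresses from Table 1 (the loopback entries are part of the default file):

    ::1           localhost
    127.0.0.1     localhost loghost
    1.2.3.4       phys-schost-1
    1.2.3.5       phys-schost-2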

On each node, verify that at least one shared storage disk is available.

In this example, the following disks are shared between the two nodes: c0t600A0B800026FD7C000019B149CCCFAEd0 and c0t600A0B800026FD7C000019D549D0A500d0.
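One way to confirm this is to list the disks seen by each node and check that the two shared LUNs appear on both (a sketch):

    # Non-interactively list the disks visible to this node.
    echo | format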

On each node, ensure that the right OS version is installed.
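For example, the installed Oracle Solaris version and SRU can be checked with the pkg utility:

    # The "entire" incorporation reflects the installed OS version and SRU.
    pkg info entire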

Ensure that the network interfaces are configured with static IP addresses (not DHCP or of type addrconf), as displayed by the ipadm show-addr -o all command.

If the network interfaces are not configured as static IP addresses, then run the command shown in Listing 2 on each node, which will unconfigure all network interfaces and services.
If the nodes are already configured with static IP addresses, go to the "Configure the Oracle Solaris Cluster Publisher" section.

Listing 2
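As an illustration of the kind of reconfiguration this step performs (a sketch; the interface name and address are the Table 1 examples, and the prefix length is an assumption):

    # Switch to the DefaultFixed network configuration profile, which removes
    # the automatically configured interfaces and addresses.
    netadm enable -p ncp DefaultFixed

    # Re-create the public interface with a static address.
    ipadm create-ip igbvf0
    ipadm create-addr -T static -a 1.2.3.4/24 igbvf0/v4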

On each node, type the following commands to configure the naming services and update the name service switch configuration:

Bind each node to the NIS server.
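A sketch of these two steps, assuming a hypothetical NIS domain name (adjust the values for your environment):

    # Configure host lookups to use local files first and then NIS.
    svccfg -s svc:/system/name-service/switch setprop config/host = astring: '"files nis"'
    svcadm refresh svc:/system/name-service/switch

    # Set the NIS domain and enable the NIS client services.
    svccfg -s svc:/network/nis/domain setprop config/domainname = hostname: example.com
    svcadm refresh svc:/network/nis/domain
    svcadm enable svc:/network/nis/domain svc:/network/nis/client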

Reboot each node to make sure that the new network setup is working fine.

Configure the Oracle Solaris Cluster Publisher

There are two main ways to access the Oracle Solaris Cluster package repository, depending on whether the cluster nodes can reach the internet directly (or through a web proxy): using the repository hosted on pkg.oracle.com or using a local copy of the repository.

Using a Repository Hosted on pkg.oracle.com

To access either the Oracle Solaris Cluster Release or Support repository, obtain the SSL public and private keys.

Go to http://pkg-register.oracle.com.

Choose the Oracle Solaris Cluster Release or Support repository.

Accept the license.

Request a new certificate by choosing the Oracle Solaris Cluster software and submitting a request. This displays a certification page that contains download buttons for downloading the key and certificate files.

Download the key and certificate files and install them, as described in the returned certification page.

Configure the ha-cluster publisher with the downloaded SSL keys to point to the selected repository URL on pkg.oracle.com.

This example uses the release repository:
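A sketch of the publisher configuration, assuming the key and certificate were installed under /var/pkg/ssl with the file names given on the certification page:

    pkg set-publisher \
        -k /var/pkg/ssl/Oracle_Solaris_Cluster.key.pem \
        -c /var/pkg/ssl/Oracle_Solaris_Cluster.certificate.pem \
        -G '*' -g https://pkg.oracle.com/ha-cluster/release/ ha-cluster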

Using a Local Copy of the Repository

To access a local copy of the Oracle Solaris Cluster Release or Support repository, download the repository image.

Download the repository image from the Oracle Technology Network or Oracle Software Delivery Cloud. To download the repository image from Oracle Software Delivery Cloud, select Oracle Solaris as the Product Pack on the Media Pack Search Page.

Mount the repository image and copy the data to a shared file system that all the cluster nodes can access.
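For example (a sketch with hypothetical paths; on Oracle Solaris 11 the ISO image can be mounted directly):

    # Mount the downloaded repository image and copy its contents to a
    # location that is shared with the other cluster node.
    mount -F hsfs /export/isos/osc-repo-full.iso /mnt
    rsync -a /mnt/repo /export/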

Configure the ha-cluster publisher.

This example uses node 1 as the system that shares the local copy of the repository:
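A sketch of the publisher setting, assuming node 1 exports the copy at /export/repo:

    pkg set-publisher -G '*' -g file:///net/phys-schost-1/export/repo ha-cluster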

Install the Oracle Solaris Cluster Software Packages

On each node, ensure that the correct Oracle Solaris package publishers are configured.

If they are not, unset the incorrect publishers and set the correct ones. The installation of the ha-cluster packages is highly likely to fail if it cannot access the solaris publisher.
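For example:

    # List the configured publishers on each node.
    pkg publisher

    # If the solaris publisher points at the wrong origin, reset it (example origin shown).
    pkg set-publisher -G '*' -g https://pkg.oracle.com/solaris/release/ solaris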

On each cluster node, install the ha-cluster-full package group.
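For example:

    pkg install ha-cluster-full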

Configure the Oracle Solaris Cluster Software

On each node of the cluster, identify the network interfaces that will be used for the private interconnects.

In this example, 8513 and 8514 are the PKEYs for a private IB partition that is used for transport. 8503 is the PKEY for a private storage network that is used to configure iSCSI storage from an Oracle ZFS Storage Appliance with an IB connection.
The Oracle ZFS Storage Appliance has the IP address 192.168.0.61 configured on the InfiniBand network. The priv1 and priv2 IB partitions are used as private interconnects for the private network. The storage1 and storage2 partitions are used for the storage network.
Type the following commands on node 1:

Type the following commands on node 2:
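The commands are essentially the same on both nodes; the following sketch shows node 1 (the IB datalink names net5 and net6 and the storage address are assumptions, while the PKEYs are the values given above). Node 2 differs only in using its own storage address.

    # Identify the IB datalinks backing the VFs.
    dladm show-ib

    # Create the private-interconnect partitions on the IB links.
    dladm create-part -l net5 -P 0x8513 priv1
    dladm create-part -l net6 -P 0x8514 priv2

    # Create the storage-network partitions and plumb an address that can reach
    # the Oracle ZFS Storage Appliance at 192.168.0.61.
    dladm create-part -l net5 -P 0x8503 storage1
    dladm create-part -l net6 -P 0x8503 storage2
    ipadm create-ip storage1
    ipadm create-addr -T static -a 192.168.0.62/24 storage1/v4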

On each node, ensure that the Oracle Solaris Service Management Facility services are not in the maintenance state.
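For example:

    # Report any services that are in the maintenance state.
    svcs -x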

On each node, ensure that the service network/rpc/bind:default has the local_only configuration set to false.

If not, set the local_only configuration to false.
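For example:

    # Check the current setting.
    svccfg -s svc:/network/rpc/bind listprop config/local_only

    # If it reports true, change it to false and refresh the service.
    svccfg -s svc:/network/rpc/bind setprop config/local_only = false
    svcadm refresh svc:/network/rpc/bind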

From one of the nodes, start the Oracle Solaris Cluster configuration. This will configure the software on the other node as well.

In this example, the following command is run on node 2, phys-schost-2.
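For example:

    # Start the interactive configuration utility on phys-schost-2.
    /usr/cluster/bin/scinstall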

From the Main menu, type 1 to choose the first menu item, which can be used to create a new cluster or add a cluster node.

Answer yes and then press Enter to go to the installation mode selection. Then select the default mode: Typical.

Provide the name of the cluster. In this example, type the cluster name as phys-schost.

Provide the name of the other node. In this example, the name of the other node is phys-schost-1. Finish the list by pressing ^D. Answer yes to confirm the list of nodes.

The next two screens configure the cluster's private interconnects, also known as the transport adapters. Select the priv1 and priv2 IB partitions.

The next screen configures the quorum device. Select the default answers for the questions asked in the Quorum Configuration screen.

The final screens print details about the configuration of the nodes and the installation log's file name. The utility then reboots each node in cluster mode.

When the scinstall utility finishes, the installation and configuration of the basic Oracle Solaris Cluster software is complete. The cluster is now ready for you to configure the components you will use to support highly available applications. These cluster components can include device groups, cluster file systems, highly available local file systems, and individual data services and zone clusters. To configure these components, refer to the Oracle Solaris Cluster 4.1 documentation library.

Verify on each node that multiuser services for the Oracle Solaris Service Management Facility (SMF) are online. Ensure that the new services added by Oracle Solaris Cluster are all online.
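For example:

    # Confirm that the multi-user milestone is online and that no services are in maintenance.
    svcs multi-user-server
    svcs -x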

From one of the nodes, verify that both nodes have joined the cluster.
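For example:

    /usr/cluster/bin/clnode status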

Verify High Availability (Optional)

This section describes how to create a failover resource group with a LogicalHostname resource for a highly available network resource and an HAStoragePlus resource for a highly available ZFS file system on a zpool resource.

Identify the network address that will be used for this purpose and add it to the /etc/inet/hosts file on the nodes. In this example, the host name is schost-lh.

The /etc/inet/hosts file on node 1 contains the following information:

The /etc/inet/hosts file on node 2 contains the following information:
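On both nodes, the added entry would look like the following sketch (the address shown is an assumption):

    1.2.3.6       schost-lh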

schost-lh will be used as the logical host name for the resource group in this example. This resource is of the type SUNW.LogicalHostname, which is a preregistered resource type.

From one of the nodes, create a zpool with the two shared storage disks /dev/did/rdsk/d1s0 and /dev/did/rdsk/d2s0. In this example, the entire disk is assigned to slice 0 on each disk by using the format utility.
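A sketch of the pool creation, using a hypothetical pool name (the DID block devices correspond to the raw device paths named above):

    # Map the shared LUNs to their DID device numbers, then build the pool.
    cldevice list -v
    zpool create HApool /dev/did/dsk/d1s0 /dev/did/dsk/d2s0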

The created zpool will now be placed in a highly available resource group as a resource of type SUNW.HAStoragePlus. This resource type has to be registered before it is used for the first time.

To create a highly available resource group to house the resources, on one node, type the following command:
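For example:

    clresourcegroup create test-rg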

Add the network resource to the test-rg group.
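A sketch, using a hypothetical resource name:

    # Create a LogicalHostname resource for schost-lh in the test-rg group.
    clreslogicalhostname create -g test-rg -h schost-lh schost-lhres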

Register the storage resource type.
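For example:

    clresourcetype register SUNW.HAStoragePlus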

Add the zpool to the group.
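A sketch, using hypothetical resource and pool names that match the earlier steps:

    # Create an HAStoragePlus resource that manages the HApool zpool.
    clresource create -g test-rg -t SUNW.HAStoragePlus -p Zpools=HApool hasp-res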

Bring the group online:
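For example:

    # Bring the group online, enabling and managing its resources.
    clresourcegroup online -eM test-rg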

Check the status of the group and the resources:
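For example:

    clresourcegroup status test-rg
    clresource status -g test-rg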

The command output shows that the resources and the group are online on node 1.

To verify availability, switch over the resource group to node 2 and check the status of the resources and the group.
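For example:

    # Move the group to node 2, then re-check the status.
    clresourcegroup switch -n phys-schost-2 test-rg
    clresourcegroup status test-rg
    clresource status -g test-rg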
