Getting to know the Solaris iSCSI stack article

I have an article titled getting to know the Solaris iSCSI stack in the August issue of SysAdmin magazine. If you are interested in learning more about iSCSI and the Solaris iSCSI stack, you might be interested in this article. I would like to thank Adam Leventhal for proofreading the article for me. Adam is a super sharp cat, and definitely one of the coolest kernel egineers I have ever met!!

Deploying highly available Oracle databases with Sun Cluster 3.2

While preparing for my Sun cluster 3.2 exam, I got a chance to play with a number of the Sun Cluster 3.2 data services. One of my favorite data services was the Oracle HA data service, which allows Sun cluster to monitor and failover databases in response to system and application failures.

Configuring the Oracle HA data service is amazingly easy, and it took me all of about 5 minutes (plus two hours reading through the Oracle data service documentation, installing Oracle and creating a database). Here are the steps I used to configure Sun Cluster 3.2 to failover an Oracle 10G database between two nodes:

Step 1. Run sccheck

The sccheck utility will verify the cluster configuration, and report errors it finds. This is useful for detecting storage and network misconfigurations early on in the configuration process. To use sccheck, you can run it without any options:

$ sccheck

If sccheck reports errors, make sure to correct them prior to moving on.

Step 2. Create a resource group for the Oracle resources

The Oracle resource group is a container for individual resources, which include virtual IP addresses, disk storage, and resources to manage the Oracle database and listener. To create a new resource group, you can run the clresourcegroup utility with the “create” option and the name of the resource group to create:

$ clresourcegroup create oracleha-rg

After the resource group is created, you will also need to put the resource group into a managed state with the clresourcegroup manage option (you can also set the resource group to the managed state when you create it):

$ clresourcegroup manage oracleha-rg

Once the resource group is in a managed state, you can bring it online with the clresourcegroup “online” option:

$ clrg online oracleha-rg

To verify that the resource group is online, you can run the clresourcegroup command with the “status” option and the name of the resource group to check (if you don’t pass a resource group name as an argument, the status of all resource groups is displayed):

$ clrg status oracleha-rg

=== Cluster Resource Groups ===

Group Name        Node Name      Suspended      Status
----------        ---------      ---------      ------
oracleha-rg       snode1         No             Offline
                      snode2         No             Online

Step 3. Register resource types with the cluster framework

Sun cluster comes with a number of resource types, which provide the brains needed to start, stop and monitor a specific type of resource (e.g., storage, virtual IP addresses, Oracle listeners, zones, etc.). To make these resource types available to the cluster framework, you need to use the clresourcetype utility to register them. The following commands will register the HAStoragePlus, oracle_server and oracle_listener resources with the cluster:

$ clresourcetype register SUNW.oracle_HAStoragePlus

$ clresourcetype register SUNW.oracle_listener

$ clresourcetype register SUNW.oracle_server

To check if a resource type is registered with the cluster framework, you can run the clresourcetype utility with the “list” option:

$ clresourcetype list

SUNW.LogicalHostname:2
SUNW.SharedAddress:2
SUNW.derby
SUNW.sctelemetry
SUNW.HAStoragePlus:4
SUNW.oracle_listener:5
SUNW.oracle_server:6

Step 4. Configure HAStoragePlus resource

The Oracle database will store datafiles and controlfiles on some type of shared storage. If the node that is running the database fails, the cluster framework will migrate the shared storage to another cluster member when the database is brought online on an alternate cluster node. To facilitate the storage migration, an HAStoragePlus resource needs to be configured with the shared storage resources. In addition to mounting and unmounting file systems, the HAStoragePlus resource will also monitor the health of the storage.

To create an HAStoragePlus resource that controls the ZFS pool named oracle, the clresource command can be executed with the “create” command, the “-t” option and the type of resource to create, the “-g” option and the name of the resource group to assign the resource to, the “Zpools” option and the name of the ZFS pool to manage, an affinity value to ensure that the storage will be online on the same node as the other resources in the resource group and the name of the resource to create:

$ clresource create -t SUNW.HAStoragePlus \
-g oracleha-rg \
-p Zpools=oracle \
-p AffinityOn=True oracledisk-res

To verify that the ZFS pool is imported and the file systems are online, you can run df to view the file systems, and the clresource utility with the “status” option to view the state of the HAStoragePlus resources:

$ df -h | egrep ‘(File|oracle)’

Filesystem             size   used  avail capacity  Mounted on
oracle                  35G    24K    35G     1%    /u01
oracle/u02              35G    24K    35G     1%    /u02
oracle/u03              35G    24K    35G     1%    /u03

$ clresource status oracledisk-res

=== Cluster Resources ===

Resource Name  Node Name   State       Status Message
-------------       ---------        -----        --------------
oracledisk-res    snode1         Offline      Offline
                       snode2         Online      Online

If the file systems don’t mount for some reason, you can check /var/adm/messages to see why.

Step 5. Configure a logical hostname resource

In order for clients to automatically adapt to database failures, they will need to be smart enough to reconnect to the database, and will also need to be configured to connect to a virtual IP address that is configured on the active database node. Virtual IP addresses are managed by the Sun cluster LogicalHostname resource type (you can also configure Scalable addresses that are mastered on more than one node, but that is a topic for a subsequent blog entry). To configure a LogicalHostname resource, you will first need to add the virtual IP address and the name associated with that address to /etc/hosts on each cluster node. After the entry is added, the clreslogicalhostname utility can be run with the “create” option, the “-g” option and the resource group to assign the logical hostname to, the “-h” option and the hostname associated with the virtual IP and a name to assign to the resource:

$ clreslogicalhostname create -g oracleha-rg -h oracle-lh oraclelh-res

To verify that the LogicalHostname resource is online, you can run ifconfig to see if the virtual IP address is configured on one of the interfaces:

$ grep oracle-lh /etc/hosts
192.168.1.32 oracle-lh

$ ifconfig -a | grep 192.168.1.32
inet 192.168.1.32 netmask ffffff00 broadcast 192.168.1.255

Step 6. Configure Oracle monitoring user

For the cluster framework to detect failures with Oracle (e.g., an Oracle internal error), an Oracle user needs to be created. This user is used by the Oracle data service to login to the database and check the operational status of the database. There are two ways the oracle data service checks to see if the Oracle database is working correctly. The first method used is a SQL select statement against the v$archive_dest and the v$sysstat views. If the values the agent retrieves from these views don’t change between consecutive runs, the agent will attempt to create a table to force the database to perform some work. For the Oracle data service to function correctly, an Oracle user needs to be created and assigned privileges to perform these operations.

To create a user named monitor that can create a table in the users tablespace and query the two views listed above, the following SQL can be used:

$ sqlplus “/ as sysdba”
sql> grant connect, resource to monitor identified by COMPLEXPASSWORD;
sql> alter user monitor default tablespace users quota 1m on users;
sql> grant select on v_$sysstat to monitor;
sql> grant select on v_$archive_dest to monitor;
sql> grant create session to monitor;
sql> grant create table to monitor;

To verify that the monitoring user works, you can connect to the database as the monitoring user:

$ sqlplus “monitor/COMPLEXPASSWORD@proddb”

Once connected, you can query both system views, and then attempt to create a table in the user’s default tablespace (in this case, the user is configured to use the users tablespace):

sql> select * from v$sysstat;

sql> select * from v$archive_dest;

sql> create table foo (a number);

sql> drop table foo;

Step 7. Create Oracle server and listener resources

After the file systems and IP addresses are online and the monitoring user is created, oracle_server and oracle_listener resources need to be created. The oracle_server resource is used to stop, start and monitor an Oracle database, and the oracle_listener resource is used to stop, start and monitor the Oracle listener.

To create a new oracle_server resource, the clresource utility can be run with the “create” option and a list of properties (e.g., where the alert log is located, the name of a user to monitor the database, the ORACLE_SID, the ORACLE_HOME, etc) that are needed to start and stop the database:

$ clresource create -g oracleha-rg \
-t SUNW.oracle_server \
-p Connect_string=monitor/COMPLEXPASSWORD \
-p ORACLE_SID=proddb \
-p ORACLE_HOME=/opt/oracle/product/10.2.0.1.0 \
-p Alert_log_file=/opt/oracle/admin/proddb/bdump/alert_proddb.log \
-p resource_dependencies=oracledisk-res oracledb-res

To create an oracle_listener resource, the clresource utility can be run with the “create” option and a list of properties (e.g., listener name, where the ORACLE_HOME resides, etc.) that are needed to start and stop the listener:

$ clresource create -g oracleha-rg \
-t SUNW.oracle_listener \
-p LISTENER_NAME=LISTENER_PRODDB \
-p ORACLE_HOME=/opt/oracle/product/10.2.0.1.0 \
-p resource_dependencies=oracledisk-res,oraclelh-res oraclelistener-res

There is one important thing to note in the commands listed above. Each resource contains a resource dependency, which the cluster framework uses to bring resource online in a specific order. The oracle_server needs the oraceldisk-res to be online prior to starting, since the database won’t be able to start unless the file systems are mounted. The oracle_listener also has a dependency on the oracle-disk-res resource, as well as a dependency on the logical host resource (this ensures that the virtual IP is up and operational prior to starting the listener). To verify that the resources are online and working, the clresource command can be run with the “status” option and the name of a resource:

$ clresource status oracledb-res

=== Cluster Resources ===

Resource Name       Node Name      State        Status Message
-------------       ---------      -----        --------------
oracledb-res        snode1         Offline      Offline
                    snode2         Online       Online

$ clrs status oraclelistener-res

=== Cluster Resources ===

Resource Name          Node Name     State      Status Message
-------------          ---------     -----      --------------
oraclelistener-res     snode1        Offline    Offline
                       snode2        Online     Online

Step 8: Verify the cluster configuration

Once all of the resources are created and the dependencies configured, you should verify that the Oracle database can successfully fail over to each node that is configured in the resource group’s node list. This can be accomplished by running the clresourcegroup utility with the “switch” option, the “-n” option and the name of the node to fail the resource group to, and the name of the resource group to failover:

$ clresourcegroup switch -n snode1 oracleha-rg

If you are able to successfully migrate the database to each node in the cluster, you now have a highly available Oracle database. While I touched on the basics of deploying a highly available Oracle database, I didn’t touch on correctly configuring the cluster framework, or the steps required to achieve maximum availability. I highly recommend reading through the Sun cluster concepts and Oracle data service guides prior to setting up your highly available databases. While the steps listed above work flawlessly in the authors environment, you should verify everything in a test environment prior to deploying anything in production.

Concert review: Tool

Over the past two years, I have grown to enjoy the music from the band Tool. Their music was truly revolutionary when they originally hit the music scene, and they are known for putting on one of the best live shows. A few weeks back I got a chance to see Tool live for the first time. Tool came on stage around 9:20pm, and opened the show with “Jambi.” This is a wicked cool tune, and like each song they played, it sounded absolutely amazing!!!! After Jambi, the band played “Stinkfist.” This tune also sounded incredible live, and it was amazing how the band sat back and let the music speak to the crowd.

Once the band finished up with Stinkfist, they played “46 & 2,” “Schism,” “Pushit,” “Lateralus,” “Lost Keys” and “Vicarious.” Each song was accompanied by a video that was playing on a number of monitors, as well as one of the coolest laser light shows I have ever seen. It’s a bummer the band didn’t play “Sober,” “The Pot” or some of their older hits, but the tunes they did play sounded magnificent! I have seen a LOT of shows in my day, and there aren’t many bands that sound better live than they do on CD. Tool is one such band, and I think this show was the best concert I have ever attended. Hopefully I can see them again in the near future!