Netmail Store cluster nodes are controlled through the SNMP action commands. These commands provide a mechanism through which nodes and volumes within nodes can be taken down for service or retired from a Netmail Store cluster.
Shutdown Action for Nodes
In order to gracefully shut down a Netmail Store node, the string “shutdown” is written to the “castorShutdownAction” OID. Similarly, writing the string “reboot” to this OID will cause a Netmail Store node to reboot.
Upon receipt of a shutdown or reboot value, the node will initiate a graceful stop by unmounting all of its volumes and removing itself from the cluster. For a shutdown, the node will be powered off if the hardware supports this. For a reboot, the node will reboot the machine, re-read the node and/or cluster configuration files, and start up Netmail Store. A graceful node stop is necessary in order to reboot quickly. If a node stops ungracefully, it will be required to perform consistency checks on all its volumes before it can rejoin the cluster. Before shutting down or rebooting, a node’s status page or the SNMP “castorErrTable” OID should be checked for critical error messages, because any critical messages logged there will be cleared upon reboot.
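The shutdown and reboot writes described above can be sketched as follows. This is a minimal illustration, not a supported tool: it assumes the Net-SNMP command-line utilities (snmpset, snmpwalk) are installed on the management host, that SNMP v2c is in use, and that the Netmail Store MIB is loaded so that the symbolic names resolve. The host address, the community string, and the helper function names are hypothetical. The sketch only builds the commands and does not run them.

```python
import shlex

# Symbolic names taken from the text. The fully qualified OID (a scalar
# instance normally ends in ".0") must come from the Netmail Store MIB file.
SHUTDOWN_OID = "castorShutdownAction"
ERROR_TABLE_OID = "castorErrTable"


def build_error_check_command(host, community, oid=ERROR_TABLE_OID):
    """Return the snmpwalk argv that lists the node's error table."""
    return ["snmpwalk", "-v", "2c", "-c", community, host, oid]


def build_shutdown_command(host, action, community, oid=SHUTDOWN_OID):
    """Return the snmpset argv that writes "shutdown" or "reboot" as a string."""
    if action not in ("shutdown", "reboot"):
        raise ValueError("action must be 'shutdown' or 'reboot'")
    # "s" tells snmpset that the value is a string.
    return ["snmpset", "-v", "2c", "-c", community, host, oid, "s", action]


# Hypothetical node address and community string, for illustration only.
check = build_error_check_command("192.0.2.10", "example-community")
reboot = build_shutdown_command("192.0.2.10", "reboot", "example-community")
print(shlex.join(check))
print(shlex.join(reboot))
```

Reviewing the output of the error-table command before issuing the second command preserves any critical messages that the reboot would otherwise clear.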
Retire Action for Nodes and Volumes
The “retire” action is used to permanently remove a node or a volume within a node from the cluster. The retirement of a node may be done when the node reaches the end of its useful service life. A volume may be retired if there is a problem with the storage hardware and it is to be removed from a node.
Note: Retire is not tuned for fast completion. Completing a retire action requires at least three health processor cycles.
When a volume is retired, all of the streams stored on it are moved to other nodes within the Netmail Store cluster. Once the retirement of a volume is initiated, the volume becomes read-only and no additional streams will be stored on it. After all of the streams have been moved elsewhere in the cluster, the volume is idled and no further read/write requests will be made to it. Each volume is given a unique name within its node: the device string from the “vols” line of the configuration file. In order to retire a volume, its name is written as a string to the “castorRetireAction” OID. The volume retirement process is initiated immediately upon receipt and the action cannot be aborted once started.
The retirement of a node involves the retirement of all of its volumes. Once all of the node's volumes have been retired and all of its data has been copied elsewhere in the cluster, the node is permanently out of service and will not respond to further requests. In order to retire a node and all of its volumes, the string “all” is written to the “castorRetireAction” OID. The node retirement process is initiated immediately upon receipt and the action cannot be aborted once started.
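The two retire writes (a volume name for a single volume, “all” for the whole node) can be sketched in the same way. As before, this assumes the Net-SNMP snmpset utility, SNMP v2c, and a loaded Netmail Store MIB. The function name, the confirm parameter, the host, the community string, and the example device string are hypothetical. Because a retire cannot be aborted once started, the sketch refuses to build the command unless the caller confirms explicitly.

```python
# Symbolic name taken from the text. The fully qualified OID must come
# from the Netmail Store MIB file.
RETIRE_OID = "castorRetireAction"


def build_retire_command(host, community, volume=None, confirm=False,
                         oid=RETIRE_OID):
    """Return the snmpset argv that retires one volume or the whole node.

    volume is the device string from the "vols" configuration line.
    Passing volume=None retires the node and all of its volumes.
    """
    if not confirm:
        # Retirement starts immediately and cannot be aborted.
        raise RuntimeError("retire is irreversible; pass confirm=True")
    target = "all" if volume is None else volume
    if not target:
        raise ValueError("volume name must not be empty")
    return ["snmpset", "-v", "2c", "-c", community, host, oid, "s", target]


# Hypothetical values, for illustration only.
one_volume = build_retire_command("192.0.2.10", "example-community",
                                  volume="/dev/sdb", confirm=True)
whole_node = build_retire_command("192.0.2.10", "example-community",
                                  confirm=True)
```

Keeping the command construction separate from its execution leaves room for an operator to review the exact target string before anything is sent to the node.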
Warning: Care must be taken to ensure that a cluster has both enough free space and enough nodes to handle the streams on a retiring volume. When subclusters are in use, both requirements apply to the subcluster where the retiring volume resides. If the number of nodes in the cluster/subcluster is insufficient to allow at least 2 replicas of all streams, the retiring node will be unable to complete the retirement until additional nodes are added. Retire does not require that the configured minreps be maintained in order to complete; however, if there are not enough nodes to maintain minreps, it will log messages about not being able to create sufficient replicas.
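The two requirements in the warning above can be expressed as a rough pre-check. This is a simplified sketch only: the RetirePlan type, its field names, and the function are hypothetical, the figures must be gathered by the administrator (no Netmail Store API is used), and actual replica placement in the cluster may impose further constraints that this arithmetic does not capture.

```python
from dataclasses import dataclass


@dataclass
class RetirePlan:
    """Operator-supplied figures for the cluster or subcluster (hypothetical)."""
    bytes_on_retiring_volume: int  # data that must move off the volume
    free_bytes_elsewhere: int      # free space on the remaining volumes
    nodes_remaining: int           # nodes still storing data afterwards


def retirement_blockers(plan, required_replicas=2):
    """Return a list of reasons the retire may be unable to complete."""
    problems = []
    if plan.free_bytes_elsewhere < plan.bytes_on_retiring_volume:
        problems.append("not enough free space for the retiring volume's streams")
    if plan.nodes_remaining < required_replicas:
        problems.append("not enough nodes to hold %d replicas" % required_replicas)
    return problems


# Example figures, invented for illustration.
plan = RetirePlan(bytes_on_retiring_volume=500, free_bytes_elsewhere=2000,
                  nodes_remaining=3)
blockers = retirement_blockers(plan)
```

An empty list means neither of the two stated conditions is violated; it does not guarantee that the configured minreps can be maintained.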
Note: For more information about SNMP and Object Identifiers (OIDs), see SNMP Management. The SNMP MIB definition file for Netmail Store is located on the USB flash device. If installing Netmail Store from a CSN, an aggregate Management Information Base (MIB) for the entire cluster is available, allowing administrators to monitor the cluster from the external network.