Netmail Archive 5.x
Archive 6.x introduces a new indexing technology: Solr. It is a Lucene-based product that runs on CentOS (Linux) and supplants the older Exalead technology used in Archive 5.x. However, there is no direct index conversion process; the only way to create the Solr indexes for your deployment is to re-read all the data (otherwise known as "re-indexing"). Below is the process devised by Netmail to accomplish this with minimal impact on your users.
Consider an initial Archive 5.x cluster deployed with:
- multiple Archive servers (master + workers)
- a dedicated Netmail Search server
- multiple indexers (Exalead type)
Your installation need not be this sophisticated (it could be a single Archive server with a single indexer), but the process remains the same.
The first step is to upgrade the existing Archive/Search servers to version 6.x. This is done in-place, so there will be a brief downtime while the installer runs and the services are restarted. When this completes, the servers will resume operations using the existing indexers (Exalead).
NB: This state is meant to be temporary. Basic jobs & end-user access are available to bridge the transition to the next stage. However, not all functionality is tested or supported in this scenario, so remaining in this configuration long term is not an option.
With the production system now running 6.x and servicing end-users, we can continue to deploy the new servers.
We start by joining the new indexers (Solr) to the production cluster. They will not be involved in any production operations for the moment.
Once the new production indexers are in place, we must feed them data so that they can build their data structures. The recommended way to do this is with an extra set of Archive servers to run index jobs. These Archive servers are not part of the production cluster and will only exist for the duration of the upgrade.
The servers are VM images you can download from Netmail. Assume 8 cores & 8 GB RAM per server. The more of these you deploy, the greater the amount of data that can be processed at once, and the sooner this stage will complete.
Using the temporary Archive servers, run the indexing jobs to send the archive data to the new indexers. This can take days to weeks, depending on many factors in the environment:
- Size of the archives
- Throughput to the source data
- Network bandwidth between servers
- Storage I/O on the indexers
- Whether indexing of the attachments is required
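To get a feel for how these factors combine, a back-of-the-envelope estimate can help when deciding how many temporary Archive servers to deploy. The sketch below is only an illustration: the function name, the per-server throughput figure and the efficiency factor are all assumptions, not Netmail values. Measure the real throughput in your own environment on a small sample of the archive before relying on any number.

```python
def estimate_reindex_days(archive_tb, per_server_gb_per_hour, servers, efficiency=0.7):
    """Rough wall-clock estimate (in days) for a full re-index.

    All inputs are hypothetical planning figures:
      archive_tb             -- total archive size to re-read, in TB
      per_server_gb_per_hour -- measured throughput of ONE temporary Archive server
      servers                -- number of temporary Archive servers running index jobs
      efficiency             -- fraction of ideal linear scaling actually achieved
                                (shared storage and network links rarely scale perfectly)
    """
    if archive_tb < 0 or per_server_gb_per_hour <= 0 or servers < 1:
        raise ValueError("size must be >= 0, throughput > 0, servers >= 1")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")

    total_gb = archive_tb * 1024
    effective_rate = per_server_gb_per_hour * servers * efficiency  # GB per hour
    return total_gb / effective_rate / 24


if __name__ == "__main__":
    # Example with made-up figures: 10 TB of archives, 50 GB/h per server.
    for n in (1, 2, 4):
        days = estimate_reindex_days(10, 50, n)
        print(f"{n} server(s): about {days:.1f} days")
```

The model assumes the temporary Archive servers are the bottleneck. If storage I/O on the indexers or throughput to the source data is the limiting factor, adding more servers will not shorten the run as much as this estimate suggests.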
NB: The index jobs run during this phase will only read the existing archives and send them for indexing. There is no data conversion involved.
With the re-index jobs completed, the indexes on the new/Solr servers should now match the indexes on the old/Exalead servers. The new servers are ready to go into production. This will also involve a brief downtime as the services are reconfigured & connected to the new targets.
NB: It is recommended to perform some spot-checking in Netmail Search to ensure that the number of results is the same before and after the switch. If required, the switch can be reversed and re-indexing repeated until results are satisfactory.
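One simple way to keep the spot-check honest is to note the result counts for a fixed set of searches before the switch, repeat the same searches afterwards, and compare the two lists. The sketch below assumes the counts are recorded by hand from Netmail Search; it does not call any Netmail or Solr API, and the function name and sample queries are purely illustrative.

```python
def compare_result_counts(before, after, tolerance=0):
    """Return the queries whose result counts differ by more than `tolerance`.

    `before` and `after` map a search query (str) to the number of results
    observed before and after the switch. A query missing from `after`
    is reported with a count of None.
    """
    mismatches = {}
    for query, old_count in before.items():
        new_count = after.get(query)
        if new_count is None or abs(new_count - old_count) > tolerance:
            mismatches[query] = (old_count, new_count)
    return mismatches


if __name__ == "__main__":
    # Hypothetical counts noted down manually from Netmail Search.
    before = {"invoice": 120, "from:alice": 45, "contract": 9}
    after = {"invoice": 120, "from:alice": 44}

    for query, (old, new) in compare_result_counts(before, after).items():
        print(f"{query!r}: {old} before, {new} after")
```

Any query reported here is a candidate for closer inspection before the old indexers are decommissioned. The `tolerance` parameter exists because new mail may be archived between the two measurements; leave it at 0 if the archive is static during the check.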
After testing that jobs execute successfully & archive visibility is restored, it is time to decommission the old indexers & temporary archivers.
The upgrade is now complete. The production cluster is fully on 6.x and using the new Solr servers.