
Publisher Tab

The Publisher tab of the Content Router object in the Netmail Administration Console displays your publisher configuration settings and allows you to modify them. The Start Service at Boot setting determines whether the publisher service starts at boot time. The Proxy communication setting enables or disables proxy communication. The Log Level setting sets the level of logging to apply. You can also change the console password and the rules file path. Click Update and Restart to apply your changes and restart the publisher service.

The View Publisher Console button opens the Publisher Console in a new window. Publisher Console allows you to view progress made by the Publisher and the associated Subscribers. For more information about the Publisher Console, see Viewing the Publisher Console below.

Replicator Tab

The Replicator tab of the Content Router object in the Netmail Administration Console displays the replicator configuration settings. It is also possible to modify the following settings: Start Service at Boot, Log Level, Subscription Check Interval, Offline After, Error Offline After, and Timeout After. Click Update to apply your changes.

The Channel Subscriptions section allows you to add a channel, modify an existing channel, or delete a selected channel.

When you are done, click Update to save your changes.

The View Publisher Console button opens the Publisher Console in a new window. Publisher Console allows you to view progress made by the Publisher and the associated Subscribers. For more information about the Publisher Console, see Viewing the Publisher Console below.

Viewing the Publisher Console

The Publisher Console allows users to view progress made by the Publisher and the associated Subscribers. The fields in the console window are described in the legend below. To access the Publisher Console, click View Publisher Console on either the Publisher or Replicator tab of the Netmail Store Content Router object on the left-hand side of the Netmail Administration Console.

Version

This field displays the version of Content Router installed on the Publisher node.

Uptime

This field displays the amount of time elapsed since the Content Router Publisher service was last started.

Source Cluster

This field displays the group multicast address of the Netmail Store cluster from which the Publisher is gathering UUIDs.

Stream Events Found

This field displays the total number of UUID write, update, and delete events that the Publisher has discovered from the Netmail Store cluster from which it is gathering UUIDs. If the Publisher's data store already contains a large number of stream events at startup, it may take a few hours for the Stream Events Found counter to display a non-zero value.

Filter Backlog

This field displays the number of UUIDs that have been discovered, but have not yet been processed by the Filter Rules Engine.

Space Available

This field displays the size and usage of the hard drive used by the Content Router Publisher node.

Channels

This area contains information for each configured channel or subscription. To view additional details for each channel, click the arrow next to the Channel name, which corresponds to the subscription name in the rules.xml file.

1. Subscriber UUID: The UUID for the subscriber instance of a particular channel. This UUID is created by a Publisher whenever a new Subscriber subscribes to a channel. Because there can be more than one Subscriber on the same Publisher channel, the Publisher must have a unique ID to keep track of each separate subscription.

2. Subscriber Status

  • Working: The subscriber is available and actively processing streams.

  • Offline: The subscriber has not contacted the Content Router Publisher node within the expected configurable frequency.

  • Idle: The subscriber is available but is not currently processing any streams.

  • Busy: The subscriber is currently processing a republish action and is not available for additional republish actions.

  • Paused: The subscriber is actively processing streams and is not taking additional input.

3. Context: The descriptive name for a subscriber instance of a particular channel. For Replicator Subscribers this name is always the same, but it is customizable for other types of Subscribers.

4. Host: The subscriber client’s server IP address.

5. Version: The Content Router software version for the enumerator client.

6. Uptime: The amount of time elapsed since the subscriber’s last reboot.

7. Stream Events Matched: The number of stream events (writes, updates, and deletes including retries) that match the subscriber’s specification, including channel and start/end interval, if applicable.

8. Backlog: The total number of found stream events that a subscriber may be interested in but has not yet finished processing. If the backlog is non-zero, it is broken down further as follows:

  • Transmit Queue: The number of stream events (writes, updates, and deletes) that have been identified but have not yet been retrieved by the subscriber.

  • Subscriber Queue: The number of stream events (writes, updates, and deletes) that have been retrieved by the subscriber but not yet processed.

  • Subscriber in Progress: The number of stream events (writes, updates, and deletes) that are currently being processed by the subscriber.

9. Dropped: The number of stream events that could not be processed by the Subscriber. For Replicator, events may end up with a Dropped status if:

  • The source cluster is unavailable or unreachable due to configuration or network connectivity issues and objects cannot be replicated

  • An object has been deleted in the source cluster so it is no longer available for replication, but the delete event has not yet been processed by the Publisher/Replicator

  • A lifepoint has deleted the object in the source cluster so it is no longer available for replication

In all these cases, Replicator will keep attempting to replicate streams for a configurable amount of time (4 days by default) and only consider events dropped after that time. If there is known delete activity that accounts for the Dropped events, no further action is necessary. If the cause is unknown or a configuration or connectivity issue has been resolved, administrators may wish to consider Republishing the subscription.

10. Last Connection: The amount of time elapsed since the Subscriber queried the Publisher for stream events.

11. Last Transmit: The amount of time elapsed since the Subscriber last retrieved at least one stream event from the Publisher.

Republish buttons

  • Republish All: This button in the top right-hand corner of the screen reruns all configured rules for all subscribers, including any new or updated rules. It will then restart all existing subscribers.

  • Subscriber Republish: This button next to each subscriber restarts the subscriber. Since the Publisher has no way of knowing the Subscriber's status for each event, all stream events that have previously been sent to the subscriber are retransmitted, along with any new events. Administrators may wish to republish a subscription if the Subscriber has failed and needs a completely fresh list of events or, in the case of a Replicator, if there is a large number of unexplained dropped events.

Starting Content Router Services

Content Router Publisher and Replicator attempt to start every time the server on which they are installed boots. After installation, administrators should update the configuration files for Publisher and/or Replicator to ensure Content Router has the information it needs to start correctly. If a service has been stopped for any reason, it can be started manually with a standard init.d script. For mirrored configurations with both services on the same server, Publisher and Replicator must be started separately as follows:

 $ sudo /etc/init.d/cr-publisher start
 $ sudo /etc/init.d/cr-replicator start

Publisher and Replicator Shutdown

To stop a Content Router service, use one of the following commands:

 $ sudo /etc/init.d/cr-publisher stop
 $ sudo /etc/init.d/cr-replicator stop

Customizing the Standard Rule Sets

The standard rule set can easily be customized to control what content is published on a given channel. Each Publish statement, or ‘channel’, is evaluated against the full list of known UUIDs. Multiple Replicators can subscribe to the same channel when replicating the same content to more than one remote cluster. Channel names are case sensitive.

The default channel in the rules.xml.sample provided with Publisher is named PublishAll. This channel name is reserved, and it intentionally has no filtering criteria to optimize performance. If you do not wish to use PublishAll, do not set up your Replicators to subscribe to it. (Replicator subscriptions are determined using the subscribeTo configuration parameter.)

Important: Do not modify the PublishAll rule. If you add any filter criteria to the PublishAll rule, Publisher logs an error and fails to start.

Whenever rules.xml is modified, regardless of whether Publisher is running or stopped, any Replicator sessions on non-default channels are terminated because the list of previously filtered events is no longer guaranteed to be accurate for the new rules. Replicators that receive a 404 on a request (because a session has been terminated) must restart their subscriber session(s). Replicator sessions on the default PublishAll channel are not affected by a rules file change.

Note: Publisher supports the ReplicateAll channel, which was the default in earlier versions of Content Router. The ReplicateAll channel displays on the Publisher console as PublishAll (irrespective of the name displayed in the rules.xml file) and behaves in all other ways the same as the PublishAll channel.

The following examples illustrate some alternate rule implementations.

Publish all streams on a single channel

This is the same as the default "PublishAll" case.

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE rule-set PUBLIC "-//CARINGO//DTD 1.0 PORTALRULES//EN"
    "file:/tmp/caringo/rules.dtd">
<rule-set>
  <publish>
    <select name="subscriptionName">
    </select>
  </publish>
</rule-set>

Publish all streams on two separate channels

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE rule-set PUBLIC "-//CARINGO//DTD 1.0 PORTALRULES//EN"
    "file:/tmp/caringo/rules.dtd">
<rule-set>
  <publish>
    <select name="subscriptionName1">
    </select>
  </publish>
  <publish>
    <select name="subscriptionName2">
    </select>
  </publish>
</rule-set>

Publish streams with header "someHeaderName" on one channel and all others except text files to a second

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE rule-set PUBLIC "-//CARINGO//DTD 1.0 PORTALRULES//EN"
    "file:/tmp/caringo/rules.dtd">
<rule-set>
  <publish>
    <select name="subscriptionName1">
      <exists header="someHeaderName"/>
    </select>
    <select name="subscriptionName2">
      <filter header="storage-filepath">
        (not contains('.txt'))
      </filter>
    </select>
  </publish>
</rule-set>
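
Before deploying a customized rules file, it can help to confirm that it is well-formed XML and to list the channel names it defines. The following is a minimal sketch in Python, not part of Content Router: the function name lint_rules and the sample document are illustrative assumptions. It uses only the standard library, does not validate against rules.dtd, and checks just one rule stated above (the reserved PublishAll channel must carry no filter criteria).

```python
import xml.etree.ElementTree as ET

RESERVED_CHANNEL = "PublishAll"  # reserved name described in this section


def lint_rules(xml_bytes):
    """Return (channel_names, problems) for a rules.xml document.

    Raises ET.ParseError if the document is not well-formed, for example
    when attribute values use typographic quotes instead of straight ones.
    """
    root = ET.fromstring(xml_bytes)
    names, problems = [], []
    for select in root.iter("select"):
        name = select.get("name")
        names.append(name)
        # Any child element (filter, exists, notexists) counts as criteria.
        if name == RESERVED_CHANNEL and len(select) > 0:
            problems.append("PublishAll must not contain filter criteria")
    return names, problems


# Hypothetical sample modeled on the examples in this section.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE rule-set PUBLIC "-//CARINGO//DTD 1.0 PORTALRULES//EN"
    "file:/tmp/caringo/rules.dtd">
<rule-set>
  <publish>
    <select name="subscriptionName1">
      <exists header="someHeaderName"/>
    </select>
    <select name="subscriptionName2">
      <filter header="storage-filepath">
        (not contains('.txt'))
      </filter>
    </select>
  </publish>
</rule-set>
"""

if __name__ == "__main__":
    channels, issues = lint_rules(SAMPLE)
    print(channels)
    print(issues)
```

Because channel names are case sensitive, the names printed here should be compared character for character with each Replicator's subscribeTo value.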

Complex Content Metadata Analysis

When determining what content will be replicated, you can use a variety of combinations of the filter, exists, and notexists clauses. In addition, the filter clause supports any combination of olderThan(), matches(), intValue() and contains(). Below are some usage examples.

Filter Clause

<select name="subscriptionName1">
  <filter header="headerName">
    olderThan("Tue, 16 Oct 2007 00:00:00 GMT")
  </filter>
</select>

<select name="subscriptionName1">
  <filter header="headerName">
    olderThan('365d')
  </filter>
</select>

<select name="subscriptionName1">
  <filter header="headerName">
    matches('.*filename\s*\=.*\.txt.*')
  </filter>
</select>

<select name="subscriptionName1">
  <filter header="headerName">
    contains('.txt')
  </filter>
</select>

<select name="subscriptionName1">
  <filter header="headerName">
    <![CDATA[
      intValue() < 50000
    ]]>
  </filter>
</select>

<select name="subscriptionName1">
  <filter header="headerName">
    intValue() != 0
  </filter>
</select>

<select name="subscriptionName1">
  <filter header="headerName">
    (not contains('.txt')) and olderThan('365d') and contains('caringo')
  </filter>
</select>

<select name="subscriptionName1">
  <filter header="headerNameXXX">
    (not contains('.txt')) and olderThan('365d') and contains('caringo')
  </filter>
  <filter header="headerNameYYY">
    contains('someStringValue')
  </filter>
</select>
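
A matches() pattern can be tried against sample header values before it goes into rules.xml. The sketch below is only an approximation: it assumes the pattern must match the entire header value and that the rules engine's regular expression dialect behaves like Python's re module for this pattern, neither of which is confirmed here. The helper name header_matches and the sample header values are hypothetical.

```python
import re

# Pattern taken from the matches() example above.
PATTERN = r".*filename\s*\=.*\.txt.*"


def header_matches(pattern, value):
    """Return True if the whole header value matches the pattern."""
    return re.fullmatch(pattern, value) is not None


# Hypothetical header values for illustration only.
SAMPLES = [
    'attachment; filename="report.txt"',
    "attachment; filename = notes.txt",
    'attachment; filename="image.png"',
    "report.txt",
]

if __name__ == "__main__":
    for value in SAMPLES:
        print(header_matches(PATTERN, value), value)
```

Under these assumptions the first two samples match and the last two do not: the third has no .txt after the file name, and the fourth has no filename= part at all.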

Exists Clause

<select name="subscriptionName1">
  <exists header="headerName"/>
</select>

<select name="subscriptionName1">
  <exists header="headerNameXXX"/>
  <exists header="headerNameYYY"/>
</select>

NotExists Clause

<select name="subscriptionName1">
  <notexists header="headerName"/>
</select>

<select name="subscriptionName1">
  <notexists header="headerNameXXX"/>
  <notexists header="headerNameYYY"/>
</select>

All Clauses

<select name="subscriptionName1">
  <filter header="headerName">
    olderThan('365d')
  </filter>
  <exists header="headerNameXXX"/>
  <notexists header="headerNameYYY"/>
</select>

HTTP Status Reporting for Publisher and Replicator

Both the Publisher and the Replicator will provide basic status information in response to a GET request from any HTTP client as follows:

For Publisher:

GET http://<PublisherIP>:<PublisherConsolePort>/status

For Replicator:

GET http://<ReplicatorIP>:<ReplicatorConsolePort>/status

The response to the request is a standard HTTP response but the response body differs slightly for each service. All active enumerators will be included in the response in the same order with every request. Enumerators may be deleted, however, so the index may change. If either Content Router service is installed on a Cluster Services Node (CSN), the response from the request is used to populate the SNMP MIB for the respective service on the CSN.
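
Any HTTP client can issue the status request. As one possible illustration, the Python sketch below builds the URL in the form shown above and fetches the body with the standard library. The function names, the example address, the port number, and the assumption that the body decodes as UTF-8 text are all hypothetical; substitute the console port configured for your own service.

```python
from urllib.request import urlopen


def status_url(host, console_port):
    """Build the status URL for a Publisher or Replicator console."""
    return f"http://{host}:{console_port}/status"


def fetch_status(host, console_port, timeout=5.0):
    """GET the status page and return the response body as text.

    Raises urllib.error.URLError if the service cannot be reached.
    """
    with urlopen(status_url(host, console_port), timeout=timeout) as resp:
        # Assumption: the body is plain text that decodes as UTF-8.
        return resp.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Hypothetical address and port, shown only to illustrate the URL shape.
    print(status_url("192.0.2.10", 8090))
```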

Publisher Response

The data in the Publisher response corresponds to the same fields available on the Publisher console.

 [settings]
 hostip={publisher ip address or host name}
 sourceCluster={source cluster multicast group}

 [stats]
 version={publisher software version}
 upTime={seconds since Publisher started}
 eventsFound={number of distinct stream events heard by Publisher}
 filterBacklog={number of stream events remaining to be info-ed and filtered}
 diskMBAvailable={total disk space available on Publisher data store file system}
 diskMBCapacity={total disk space capacity on Publisher data store file system}
 cycleTime={seconds elapsed for most recent completed data store merge cycle}
 cycleStart={start of current data store merge cycle}
 cycleNumber={data store merge cycle count during current Publisher run}
 [enumerator1]
 uuid=...
 channel={rules file channel for which the enumerator was instantiated}
 context={optional description for enumerator}
 host=...
 version=...
 status=...
 eventsMatched=...
 backlogTotal=...
 transmitQueue=...
 subscriberQueue=...
 subscriberInProgress=...
 droppedEvents=...
 lastConnection=...
 lastTransmit=...
 [enumerator2]
 uuid=...
 .
 .
 .
 [enumerator3]
 uuid=...
 .
 .
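
The response layout above resembles an INI file, so a monitoring script could read it with a standard INI parser. The following Python sketch assumes the body really is INI-compatible (bracketed section names followed by key=value lines) and that counters such as droppedEvents are plain integers; neither is guaranteed by this page. The function names and every value in the sample are made up for illustration.

```python
import configparser


def parse_status(text):
    """Parse a status response body into {section: {key: value}}."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep key case, e.g. droppedEvents
    # Strip indentation so no line is mistaken for a continuation line.
    cleaned = "\n".join(line.strip() for line in text.splitlines())
    parser.read_string(cleaned)
    return {section: dict(parser[section]) for section in parser.sections()}


def enumerators_with_drops(status):
    """Return the enumerator sections that report dropped events."""
    return [
        name
        for name, fields in status.items()
        if name.startswith("enumerator")
        and int(fields.get("droppedEvents", 0)) > 0
    ]


# Hypothetical response body; all values are invented.
SAMPLE_STATUS = """
 [settings]
 hostip=192.0.2.10
 sourceCluster=225.0.10.100

 [stats]
 version=2.2
 upTime=3600
 eventsFound=1200
 filterBacklog=0
 [enumerator1]
 uuid=aaaa
 channel=PublishAll
 status=Idle
 droppedEvents=0
 [enumerator2]
 uuid=bbbb
 channel=PublishAll
 status=Working
 droppedEvents=7
"""

if __name__ == "__main__":
    status = parse_status(SAMPLE_STATUS)
    print(status["stats"]["upTime"])
    print(enumerators_with_drops(status))
```

Because enumerators may be deleted and their index may change between requests, a script like this should identify subscribers by the uuid field rather than by section name.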

Replicator Response

 [settings]
 hostip={replicator ip address or host name}
 targetCluster={target cluster multicast group}
 [stats]
 version={replicator software version}
 upTime={seconds since Replicator started up}
 [enumerator1]
 uuid=...
 publisherIP=...
 channel=...
 subscriberQueue=...
 subscriberInProgress=...
 droppedEvents=...
 lastConnection=...
 lastTransmit=...
 [enumerator2]
 uuid=...
 .
 .
 [enumerator3]
 uuid=...
 .
 .

Request for Source Cluster IP Addresses

Publisher clients such as Replicator may need to know the IP addresses of nodes in the Netmail Store source cluster that Publisher listens to, in order to make SCSP requests to that cluster. The Publisher Channel Server supports a request for a list of source cluster IP addresses and SCSP port numbers:

 GET /sourceclusteripports.bin HTTP/1.1

Send the request to the Publisher's IP address and Publication Server Port. For example:

 http://publisherIp:publicationServerPort/sourceclusteripports.bin

A normal response to this request is:

 HTTP/1.1 200
 Date: ...
 Server: Content Router Publisher
 Content-Type: text/plain
 Content-Length: ...

Each line of the response body will be of the form "<source cluster IP address>:<SCSP port number>". Over time, new source cluster nodes may come online and others may go offline, so it may become necessary for a client to update its list of IP addresses. This could be done, for instance, whenever a previously accessible source cluster node is no longer accessible, or at regular time intervals.
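
A client could turn the response body into a list of address and port pairs as in the Python sketch below. The function name parse_ip_ports and the sample addresses are hypothetical; the only format relied on is the one-entry-per-line "<source cluster IP address>:<SCSP port number>" form described above.

```python
def parse_ip_ports(body):
    """Parse lines of the form 'ip:port' into a list of (ip, port) tuples."""
    nodes = []
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # Split on the last colon so the port is always the final field.
        host, _, port = line.rpartition(":")
        nodes.append((host, int(port)))
    return nodes


if __name__ == "__main__":
    # Hypothetical response body for illustration.
    sample_body = "192.0.2.21:80\n192.0.2.22:80\n"
    print(parse_ip_ports(sample_body))
```

A client that refreshes its node list, whether on a timer or after a node stops responding, can call the request again and replace the old list with the newly parsed one.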
