

This appendix describes the enumeration commands and associated data types in the Content Router Publisher server interface that you can access in Content Router Replicator to provide reliable object replication across wide area networks.


The Content Router Publisher provides a public HTTP 1.1 server interface to enable a simple, standards-based approach for building Netmail Store internal and third-party applications similar to plug-ins that require some form of object enumeration. The Content Router Replicator service uses this interface extensively to provide reliable object replication across wide area networks. Examples of other applications might include third-party object replication or backup applications, object index or search engines, metadata query engines, and virus scan applications.

The base URL for any enumerator is: http://publisher-ip-address:publication-server-port/

where publisher-ip-address is specified by the value of the ipaddress configuration variable and publication-server-port is specified by the value of the publicationServerPort configuration variable (by default, 80 for a non-CSN installation and 8080 for a CSN installation).

Enumerator Types

Enumerator types are defined based on the type of data that should be returned with the response. The following table lists the supported enumerator data types.

Enumerator Data Type | Description
Metadata

(Default) Returns the name or UUID and all associated metadata for objects that match the specified filters. Use the Metadata enumerator when you want to see the metadata associated with each object.

List

Supports unnamed and named objects. For immutable unnamed objects and for mutable unnamed objects, it returns the object's UUID. For named objects, it returns the object's qualified name. For example: /cluster.example.com/mybucket/myobject.html

UUID

(Deprecated) Returns the UUIDs for all unnamed objects that match the specified filters.

Event

Returns the events that occurred in the cluster.

Each entry contains a name or UUID along with the type of event that occurred (create, update, or delete). Named and unnamed Netmail Store objects are returned by the Event enumerator. For named objects, the fully qualified object name is returned (for example, /domain-name/bucket-name/object-name). To distinguish a named object from an unnamed object, your client should look for the leading slash character.

Note:

  • For the List and Event enumerator types, objects are returned as a comma-separated list. Because named objects are returned as URL percent encoded values, commas in object names are encoded as %2C.
  • Any versioned object (which excludes immutable objects) that was updated can result in enumerators giving multiple results for the same object. This occurs because mutable object UUIDs for creates and updates are included regardless of subsequent deletes for proper processing of all object revisions across different nodes and clusters.
  • When you change a domain name, it can take a while for enumerations to start using the new name for named objects in that domain.
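The decoding and the leading-slash check described above can be sketched as follows. This is a minimal illustration, not part of the Publisher interface: the function name decode_entry is hypothetical, and it assumes each entry has already been isolated from the response as a single string.

```python
from urllib.parse import unquote


def decode_entry(raw: str) -> tuple[str, bool]:
    """Return (decoded_value, is_named) for one List or Event entry.

    Hypothetical helper: named objects are reported by qualified name,
    which starts with "/", while unnamed objects are reported by UUID.
    """
    entry = raw.strip()
    is_named = entry.startswith("/")
    # Named objects are percent-encoded, so "%2C" becomes "," again here.
    return unquote(entry), is_named
```

For example, decode_entry("/cluster.example.com/mybucket/a%2Cb.html") yields the name with a literal comma and is_named set to True.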

Overview of Enumeration

In general, you begin an enumeration with a call to Enumeration start(), then call Enumeration next() a number of times. Because new content can be added to a cluster at any time and Content Router requires time to build its lists of the objects in a cluster, the cluster object list is never definitive. The Events-Available and Events-Sent response headers enable an enumerator's next() method to determine if it is "caught up" with the number of events currently known to the Publisher. However, it is possible to receive 0 Events-Sent on an Enumerator Next request and then receive non-zero Events-Sent on a subsequent Next request as new events become available on the channel.

Enumeration end() finishes the enumeration and releases the associated resources. It is up to you—not Content Router—to determine when the enumeration ends. If a call to next() returns no events, it means that Content Router found nothing matching at that point in time. It does not mean the list was completely enumerated. Content Router responds to a GET or DELETE request with 404 (Not Found) if the name or UUID was not found, or a 400 (Bad Request) for an invalid request (such as an invalid UUID).
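The start, next, and end sequence can be sketched as a small client loop. This is an illustrative outline under stated assumptions, not a reference client: the send callable is a hypothetical transport that takes an HTTP method and a request target and returns (status, headers, body), and the sketch assumes that the Content-UUID header of the Start response carries the enumerator identifier used in later requests. Because an empty Next response does not mean the enumeration is complete, the caller decides how many polls to make.

```python
from typing import Callable
from urllib.parse import urlencode

# Hypothetical transport: (method, request_target) -> (status, headers, body)
Transport = Callable[[str, str], tuple[int, dict[str, str], str]]


def enumerate_channel(send: Transport, channel: str, max_polls: int,
                      max_items: int = 5000) -> list[str]:
    """Start a List enumerator, poll it max_polls times, then end it."""
    status, headers, _ = send("POST", f"/{channel}?type=List")
    if status != 201:
        raise RuntimeError(f"Enumerator Start failed with status {status}")
    enumerator_id = headers["Content-UUID"]  # assumed to identify the enumerator
    token = headers.get("Content-Sync-Token")
    lines: list[str] = []
    try:
        for _ in range(max_polls):
            query = {"maxItems": max_items}
            if token:
                query["syncToken"] = token  # acknowledge the previous response
            status, headers, body = send("GET", f"/{enumerator_id}?{urlencode(query)}")
            if status != 200:
                raise RuntimeError(f"Enumerator Next failed with status {status}")
            token = headers.get("Content-Sync-Token", token)
            # An empty body only means nothing matched at this point in time.
            lines.extend(line for line in body.splitlines() if line)
    finally:
        send("DELETE", f"/{enumerator_id}")  # release Publisher resources
    return lines
```

Injecting the transport keeps the loop independent of any particular HTTP library; a real client would wrap its HTTP calls in a function with this signature.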


Enumerator Start

The Enumeration Start command instantiates an object enumerator for a given channel in Publisher, and returns a unique identifier for this enumerator. The format of the request is as follows:

POST /<channel>?type=<enumerator type>&start=<date-time1>&end=<date-time2> HTTP/1.1

Host: <publisherhost>

Here channel is the "subscription name" as specified when configuring a Content Router replicator service. It corresponds to one of the sets of filter rules identified by a select tag in the publisher rules.xml file. The start and end parameters delimit the create dates of objects to be enumerated for metadata and UUID enumerator types. Event type enumerators do not support start and end dates. Both dates are ISO 8601 date-time values; RFC 1123 formatted date-time values are not yet supported. The time-of-day specification may be omitted, in which case the time 00:00:00 is assumed. The start date-time is inclusive, the end date-time non-inclusive.

The following table describes the parameters in the basic request:

Parameter | Description
channel-name

Defined in the Publisher rules.xml file by a select name= XML tag.

A Replicator subscribes to one or more Publisher channels (specified by the value of the subscribeTo configuration parameter in replicator.cfg).

type

One of the following:

  • Metadata (default, used if no type is specified)
  • List
  • UUID
  • Event

For more information, see Enumerator Types.

start

(Not supported by the Event enumerator) Start of the time range to enumerate, date/time inclusive, in ISO 8601 format.

If the time of day is omitted, 00:00:00 is assumed. All timestamp comparisons are based on file creation timestamps.

end

(Not supported by the Event enumerator) End of the time range to enumerate, date/time non-inclusive, in ISO 8601 format. Omitting the time-of-day assumes a time of 00:00:00.
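As an illustration of the parameters in the table above, the following sketch assembles an Enumerator Start URL. The function name start_url and its arguments are hypothetical. The sketch assumes that the output of Python's isoformat() is an acceptable ISO 8601 form for the Publisher, and it rejects start and end for Event enumerators on the client side because the Publisher does not support them.

```python
from datetime import date
from typing import Optional
from urllib.parse import quote, urlencode


def start_url(host: str, port: int, channel: str, enum_type: str = "Metadata",
              start: Optional[date] = None, end: Optional[date] = None) -> str:
    """Build the URL for a POST that starts an enumerator (hypothetical helper)."""
    if enum_type == "Event" and (start is not None or end is not None):
        raise ValueError("Event enumerators do not support start and end dates")
    params = {"type": enum_type}
    if start is not None:
        params["start"] = start.isoformat()  # inclusive
    if end is not None:
        params["end"] = end.isoformat()      # non-inclusive
    return f"http://{host}:{port}/{quote(channel)}?{urlencode(params)}"
```

Passing date values omits the time of day, so 00:00:00 is assumed; datetime values (a subclass of date) include it.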

Enumerator Start Response

Below is a sample normal response to an Enumerator Start command:

HTTP/1.1 201 Created

Date: ...

Server: Content Router Publisher g9

Content-UUID: 41a140b5271dc8d22ff8d027176a0821

Content-Enum-Type: List

Content-Enum-Channel: channel-name

Content-Sync-Token: token-value

Events-Available: 122770

Events-Sent: 0

Content-Type: text/plain

Content-Length: bytes

The following table describes the response headers in the example:

Response Header | Description
Server

Standard HTTP 1.1 header that specifies the Content Router software version running on the responding node.

Content-UUID

The UUID of the enumerator object returned by the request.

Content-Sync-Token

Used by the enumerator Next command to indicate the previous request was successfully received.

Content-Type

Standard HTTP 1.1 header

Content-Length

Standard HTTP 1.1 header

Events-Available

Indicates the total number of events available to the subscriber. Note that the number of events the Publisher knows about can increase at any time due to activity in the cluster.

Events-Sent

Number of events sent to the subscriber (including the current request). For the start() method, this number is always 0.

Content-Enum-Type

Indicates the enumerator type (List, UUID, Event, or Metadata) returned from the enumerator start().

Content-Enum-Channel

The channel name returned from the enumerator start().

The (optional) start and end date-times in the response body are returned as UNIX epoch time seconds. If for any reason the request is not successful, a 404 response code is returned with a descriptive message in the response body about the encountered problem.
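A client might capture the Start response headers in a small record, as in the sketch below. The StartResult class and the parse_start_response function are hypothetical names; only the header names and the 201 status come from the sample response above.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class StartResult:
    enumerator_uuid: str
    enum_type: str
    channel: str
    sync_token: str
    events_available: int
    events_sent: int


def parse_start_response(status: int, headers: Mapping[str, str]) -> StartResult:
    """Validate the status and extract the enumerator headers."""
    if status != 201:
        raise RuntimeError(f"Enumerator Start failed with status {status}")
    # HTTP header names are case-insensitive, so normalize before lookup.
    h = {name.lower(): value.strip() for name, value in headers.items()}
    return StartResult(
        enumerator_uuid=h["content-uuid"],
        enum_type=h["content-enum-type"],
        channel=h["content-enum-channel"],
        sync_token=h["content-sync-token"],
        events_available=int(h["events-available"]),
        events_sent=int(h["events-sent"]),
    )
```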

Enumerator Next

The Enumerator Next command is a request for the next objects in an enumeration that was previously initiated with an Enumerator Start command. For performance reasons Publisher does not attempt to retrieve elements in a specific order. The format of the basic request is as follows:

GET /Enumerator-UUID?maxItems=max-objects-to-retrieve HTTP/1.1

Host: publisher-host-or-ip

Notes:

  • With the Events-Sent response header, the subscriber no longer needs to keep track of how many events were received. Even when Events-Sent is equal to Events-Available, the subscriber should not assume that it has received all events that occurred in the Netmail Store source cluster because the list of events compiled by Content Router cannot be assumed to be definitive.
  • Whenever the rules.xml file is modified (regardless of whether Publisher is running or stopped when the modification occurs), any enumerator sessions on non-default channels are terminated because the list of previously filtered events is no longer guaranteed to be accurate for the new rules. Enumerators that receive a 404 message on a request (because a session was terminated) automatically restart their subscriber session(s).

Replicator sessions on the default PublishAll channel are not affected by a rules file change.

Enumerator Next Query Arguments

The following table lists the optional query arguments that are defined for the Next command only and used to convey status information about the Enumerator to the Publisher. The Publisher will use some but not all of this information to update Enumerator status on the Publisher console.

Argument Name | Description
upTime

(Deprecated) The number of seconds since the subscriber was started.

backLog

(Deprecated) The number of items the subscriber is holding that it has not begun processing.

inProgress

(Deprecated) The number of items the subscriber is actively processing.

dropped

(Deprecated) The number of items that the subscriber failed to process. The dropped query argument provides the subscriber a way to report possible trouble conditions to the Publisher. Subscribers should take care to only "drop" events after sufficient retries. Note: Dropped events will not be resent by the Publisher without manual intervention using the Republish or Republish All commands from the Publisher Console.

syncToken

The value of the Content-Sync-Token header from the last successfully received Start or Next response.

The token confirms to the Publisher that the previous response was successfully received and ensures that UUIDs are not missed due to a network or connectivity issue. If the syncToken matches the one expected by the Publisher from the previous request, the next set of requested UUIDs will be sent. If the syncToken does not match, the Publisher will resend the previous set of UUIDs. If the syncToken is not present on the request, the Publisher will automatically send the next set of requested UUIDs.

Note: Query arguments listed as deprecated in the preceding table are accepted in Netmail 5.3 but are ignored. These query arguments will be removed in future Content Router documentation.
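The syncToken rules described in the table can be summarized as a small decision function. This is only a model of the documented behavior for reasoning about client retries, not Publisher code; the function name and return values are hypothetical.

```python
from typing import Optional


def publisher_action(expected_token: str, received_token: Optional[str]) -> str:
    """Model which batch the Publisher sends for a Next request."""
    if received_token is None:
        # No syncToken on the request: the next set is sent automatically.
        return "send-next"
    if received_token == expected_token:
        # The subscriber confirmed receipt of the previous response.
        return "send-next"
    # A mismatch suggests the previous response was lost, so it is resent.
    return "resend-previous"
```

One consequence of this model is that a client which always echoes the last token it actually received gets a resend after a dropped response, while a client that omits the token cannot detect the gap.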

Enumerator Next Response

Below is a typical normal response to an Enumerator Next command of type List, UUID, or Event:

HTTP/1.1 200 OK
Date: ...
Server: Content Router Publisher g9
Events-Available: 122770
Events-Sent: 5005
Content-Sync-Token: token-value
Content-Enum-Type: List
Content-Enum-Channel: channel-name
Content-Type: text/plain
Content-Length: bytes

For a Metadata Enumerator, the response has an additional Content-UUID header containing the object's UUID. The following table provides descriptions of the response headers:

Response Header | Description
Server

Standard HTTP 1.1 header that specifies the Content Router software version running on the responding node.

Content-UUID

(Metadata enumerator only) The UUID of the object returned by the enumerator.

Content-Sync-Token

Used by the enumerator Next command to indicate the previous request was successfully received.

Content-Type

Standard HTTP 1.1 header

Content-Length

Standard HTTP 1.1 header

Events-Available

Indicates the total number of events available to the subscriber. Note that the number of events the Publisher knows about can increase at any time due to activity in the cluster.

Note: If Events-Sent is equal to Events-Available, the subscriber should not assume that it has received all events in the Netmail Store source cluster because Content Router does not build a list of all objects in a cluster.

Events-Sent

Number of events sent to the subscriber (including the current request).

Content-Enum-Type

Indicates the enumerator type (List, UUID, Event, or Metadata).

Content-Enum-Channel

The channel name.

The content and format of each line of the response body varies by enumerator type as follows:

  • List enumerator: Returns the object's qualified name or UUID
  • UUID enumerator: Format is "uuid"
  • Event enumerators: Format is "qualified-name-or-uuid,event-type", where event-type is one of the following:
    • 1 (deleted object)
    • 2 (immutable unnamed object)
    • 4 (mutable unnamed object)
    • 8 (named object)
    • 16 (domain or bucket)
  • Metadata enumerators: Format is "header-name: value" (for each header present on the object).

Note: The event-type shown for any mutable object (including named objects) will include all versions of the object, including updates and deletes.

Note: For List, UUID, and Event enumerators, between 0 and maxItems lines are returned—one event or object per line. The format of each line depends on the enumerator type. The Metadata enumerator provides one object per next request. The lines are the metadata tags on the object.

Metadata enumerators contain only metadata for a single object. However, List, UUID and Event enumerators support inclusion of data for multiple objects. If for any reason the request is not successful, a 404 response code is returned with a descriptive message in the response body as to the encountered problem. If the enumerator is caught up and there are no events sent with the Next request, the content body of the response will be null.
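The body formats described above could be parsed along these lines. This is a sketch with hypothetical function names; it assumes one entry per line, and it splits Event lines on the last comma, which is safe because commas inside names arrive encoded as %2C.

```python
from urllib.parse import unquote

# Event type codes as listed for the Event enumerator.
EVENT_TYPES = {
    1: "deleted object",
    2: "immutable unnamed object",
    4: "mutable unnamed object",
    8: "named object",
    16: "domain or bucket",
}


def parse_event_body(body: str) -> list[tuple[str, int]]:
    """Parse "qualified-name-or-uuid,event-type" lines into (name, code) pairs."""
    events = []
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue  # a caught-up enumerator returns an empty body
        name, _, code = line.rpartition(",")
        events.append((unquote(name), int(code)))
    return events


def parse_metadata_body(body: str) -> dict[str, str]:
    """Parse "header-name: value" lines for the single object returned."""
    metadata = {}
    for line in body.splitlines():
        if ":" in line:
            name, _, value = line.partition(":")
            metadata[name.strip()] = value.strip()
    return metadata
```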

Enumerator End

The Enumerator End command is called to end an enumeration that was previously initiated with an Enumerator Start command. The format of the request is as follows:

DELETE /enumerator-uuid HTTP/1.1

Host: publisher-host-or-ip

End Response

The normal response to an Enumerator End command is as follows:

HTTP/1.1 200
Date: ...
Server: Content Router Publisher g9
Content-Enum-Channel: channel-name
Content-Type: text/plain
Content-Length: ...

Object Enumerator deleted

The following table describes the response headers. If for any reason the request is not successful, a 404 response code is returned with a descriptive message in the response body as to the encountered problem.

Response Header | Description
Server

Standard HTTP 1.1 header that specifies the Content Router software version running on the responding node.

Content-UUID

The UUID of the deleted object returned by the enumerator.

Content-Type

Standard HTTP 1.1 header

Content-Length

Standard HTTP 1.1 header

Content-Enum-Channel

The channel name.

Enumerator Timeout

The Content Router Publisher uses the subscriberTimeoutInterval configuration parameter to terminate and remove the enumerator for any channel if the enumerator has not responded to a GET in the specified length of time. You can adjust its value in the publisher.cfg file. By default, this value is set to 90000 seconds (25 hrs). An individual Enumerator may override the publisher timeout value. This is done by including a timeout query argument on the POST or GET request, as follows:

POST /channel-name?start=date-time1&end=date-time2&type=enumtype&timeout=86400 HTTP/1.1
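A client-side helper for the timeout override might look like the sketch below. The names are hypothetical. The 600-second floor is the minimum supported value given in the description of the timeout query argument; raising an error for smaller values is a choice made in this sketch, since how the Publisher treats them is not described here.

```python
from typing import Optional
from urllib.parse import urlencode

MIN_TIMEOUT_SECONDS = 600        # minimum supported timeout value
DEFAULT_TIMEOUT_SECONDS = 90000  # default subscriberTimeoutInterval (25 hours)


def timeout_query(timeout_seconds: Optional[int] = None) -> str:
    """Return the timeout query fragment, or "" to use the Publisher setting."""
    if timeout_seconds is None:
        return ""
    if timeout_seconds < MIN_TIMEOUT_SECONDS:
        raise ValueError(f"timeout must be at least {MIN_TIMEOUT_SECONDS} seconds")
    return urlencode({"timeout": timeout_seconds})
```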

Configuration and Status Query Arguments

The following table describes the optional query arguments that can be supplied on either an enumerator start() or next() method.

Argument Name | Description
version

The enumerator client's software version string. When provided, this value will be displayed on the Publisher console in association with the channel or subscription rule set name.

evType

Used by an enumerator's start() method to qualify the results it returns. Omitting the evType query argument returns all events that would normally be returned by the specified enumerator type (that is, the start() method's type query argument). Values for evType can be a combination (by addition) of some subset of the following values, subject to the type of implemented enumerator:

  • 1. Matches deleted objects of the kind specified by the other values.
  • 2. Matches existing immutable objects.
  • 4. Matches existing mutable unnamed objects.
  • 8. Matches existing named objects.
  • 16. Matches buckets or domains.

Use the following guidelines:

  • If type=UUID, only some combination of 2, 4, and 16 can be used.
  • If type=MetaData or type=List, only some combination of 2, 4, 8, and 16 can be used.
  • If type=Event, any combination can be used except 1 (alone).
autoRestart

Useful if multiple subscribers share the same enumerator to enable UUID processing to continue to be shared if there is a change to the rules.xml file. A change to this file means that UUIDs must be re-evaluated. Set this argument to one of the following case-insensitive values:

  • true. After a change to the rules.xml file, UUID processing continues to be shared between the subscribers.

  • false. Causes obsolete UUIDs to be re-evaluated when any of the following is true:
    • The enumerator is explicitly deleted by the subscriber.
    • The rules.xml file is modified in any significant way, such as adding or removing filtering statements.
    • The subscriber times out from inactivity.

The default behavior is determined as follows:

  • If autoRestart is present in the query string but no value is specified, it defaults to true.
  • If the query argument is never sent by the subscriber, it is assumed to be false.

name

The subscriber name.

ipaddress

The subscriber client's server IP address. If you are using an alias IP address, the displayed IP will be for the host server, not the alias on the server.

status

Indicates one of the following states to display on the Publisher Admin Console:

  • Idle. The subscriber is not performing work. No events should be retrieved in this case.
  • Working. The subscriber is performing work and making progress. Indicates event retrieval can continue.
  • Busy. The subscriber must perform more processing before it resumes processing events. No events should be retrieved in this case.
  • Error. The subscriber is in an error state with detail provided.
statusDetail

Detailed status information to display on the Publisher Admin Console. The value must be in URL-escaped format.

context

(Deprecated and replaced by name) The descriptive name for an enumerator instance of a particular channel. When provided, this value displays on the Publisher console in the expandable section with enumerator details.

maxItems

The maximum number of items to send in the response to this request. The enumerator remains paused until a non-zero maxItems value is received. If neither the enumerator start nor a next request specifies a value for the maxItems parameter, Publisher uses the default value of 5000. A negative or non-integer value is interpreted as 0. You can change the default by adjusting the enumeratorDefaultMaxItems configuration parameter in publisher.cfg.

Note: The maxItems query argument is ignored by the Metadata enumerator.

offlineAfter

(Deprecated) The number of seconds the Publisher should wait after the last enumerator query before declaring the enumerator as off-line. When used, this parameter should be some multiple of the expected interval between enumerator queries. When not specified, a default value for offlineAfter will be derived from the Publisher's subscriberOfflineAfter configuration value.

errOfflineAfter

(Deprecated) Number of seconds the Publisher should wait after the last enumerator query before showing an off-line error. When used, this parameter should be some multiple of the expected interval between enumerator queries. When not specified, a default value for errOfflineAfter is derived from the Publisher's subscriberErrOfflineAfter configuration value.

timeout

The number of seconds before the Publisher terminates an unresponsive enumerator. The minimum supported value is 600 seconds (10 minutes). If this value is not set, the value of the configuration variable subscriberTimeoutInterval is used.
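The evType guidelines in the table above lend themselves to a small validation sketch. The EvType class and the ev_type_value function are hypothetical names; the numeric values and the allowed combinations come from the evType description. The enumerator type is compared case-insensitively here as a convenience of the sketch, not as a documented Publisher behavior.

```python
from enum import IntFlag


class EvType(IntFlag):
    DELETED = 1
    IMMUTABLE = 2
    MUTABLE_UNNAMED = 4
    NAMED = 8
    BUCKET_OR_DOMAIN = 16


# Values that may be combined for each enumerator type.
ALLOWED = {
    "uuid": 2 | 4 | 16,
    "metadata": 2 | 4 | 8 | 16,
    "list": 2 | 4 | 8 | 16,
    "event": 1 | 2 | 4 | 8 | 16,
}


def ev_type_value(enum_type: str, *flags: EvType) -> int:
    """Combine flags by addition and check them against the guidelines."""
    combined = 0
    for flag in flags:
        combined |= int(flag)
    if combined == 0:
        raise ValueError("at least one evType value is required")
    if combined & ~ALLOWED[enum_type.lower()]:
        raise ValueError(f"evType={combined} is not valid for type={enum_type}")
    if enum_type.lower() == "event" and combined == EvType.DELETED:
        raise ValueError("evType=1 cannot be used alone")
    return combined
```

For example, an Event enumerator limited to deleted named objects would use ev_type_value("Event", EvType.DELETED, EvType.NAMED), which yields 9 for the evType query argument.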

Notes About the Event and List Enumerators for Named Objects

This section applies to using the Event and List enumerators to enumerate domains, buckets, or named objects when the domain or bucket is unavailable at the time of enumeration. This section does not apply to unnamed mutable or unnamed immutable objects.

Note: Netmail recommends that before you delete a container object, you delete everything in the container. For example, before you delete a bucket, delete all objects contained in the bucket. Doing so avoids the issues discussed in this section.

When you enumerate named objects, keep the following rare cases in mind:

  • If a bucket is deleted and added again with the same name, the format of the qualified name of objects in the bucket changes.

For example, suppose you create the following object:

/cluster.example.com/mybucket/myobjects/myobject.html

This is referred to as the object's qualified name because it includes the domain name (cluster.example.com), the bucket name (mybucket), and the object name (myobjects/myobject.html). When an enumerator returns an object name, it is usually in this format.

However, if a user deletes mybucket and later creates another bucket named mybucket in cluster.example.com, the enumerator returns the qualified object name with the following differences:

    • The leading slash character is missing.
    • The bucket's name is replaced by the bucket's UUID.

So instead of returning /cluster.example.com/mybucket/myobjects/myobject.html, the enumerator returns mybucket-UUID/myobjects/myobject.html.

  • If the Netmail Store source cluster is not available or if the volume storing a bucket no longer contains the object, the enumerator returns the following:
    • The object names are returned as <expired>.
    • The container names are returned as UUIDs instead of names. (The container for an object is a bucket; the container for a bucket is a domain.)
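A client that needs to recognize these rare cases could classify each returned entry roughly as follows. This is a sketch under assumptions: the function name and category labels are hypothetical, and the exact position of <expired> within a returned entry is not specified in this section, so the sketch only checks whether the decoded entry ends with it.

```python
from urllib.parse import unquote


def classify_entry(raw: str) -> str:
    """Return a rough category for one List or Event entry (hypothetical)."""
    value = unquote(raw.strip())
    if value.endswith("<expired>"):
        # Object name unavailable; containers are reported as UUIDs.
        return "expired"
    if value.startswith("/"):
        # Usual form: /domain-name/bucket-name/object-name
        return "qualified-name"
    if "/" in value:
        # No leading slash: the bucket name was replaced by the bucket UUID.
        return "bucket-uuid-form"
    return "unnamed-uuid"
```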