Versioning of Pages

A staging mechanism that supports combining page versions into releases

Composum uses JCR versioning to support a publication flow. Versions of pages can be checked in, and such check-ins can be composed into releases that can be previewed and published.

Definitions and general idea

A site provides the frame for versioned resources and releases and configures e.g. which release is public, and whether it is accessible to the public at all.

A release is a collection of versions of versionable resources (e.g. Pages) whose versions are stored using JCR versioning.

A release is tagged by a release number (such as r1, r3.5, r4.8.9) and has some additional metadata (title, description etc.). The idea behind a release number such as r3.8.17 is that it has a major number (3) that is increased e.g. for relaunches or major restructurings, a minor number (8) that is increased between major releases, and a bugfix number (17) that is increased e.g. if a couple of page updates are published.
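The numbering scheme can be illustrated with a small helper. This is only a sketch: the class ReleaseNumber and its methods are hypothetical and not part of Composum, and treating omitted parts as 0 (so that r3.5 reads as r3.5.0) is an assumption made for the example.

```java
/** Hypothetical helper, not part of Composum: parses, compares and increments release numbers. */
final class ReleaseNumber implements Comparable<ReleaseNumber> {
    private final int major;
    private final int minor;
    private final int bugfix;

    private ReleaseNumber(int major, int minor, int bugfix) {
        this.major = major;
        this.minor = minor;
        this.bugfix = bugfix;
    }

    /** Parses e.g. "r1", "r3.5" or "r4.8.9"; omitted parts are treated as 0 (an assumption). */
    static ReleaseNumber parse(String text) {
        if (text == null || !text.matches("r\\d+(\\.\\d+){0,2}")) {
            throw new IllegalArgumentException("Not a release number: " + text);
        }
        String[] tokens = text.substring(1).split("\\.");
        int[] parts = new int[3];
        for (int i = 0; i < tokens.length; i++) {
            parts[i] = Integer.parseInt(tokens[i]);
        }
        return new ReleaseNumber(parts[0], parts[1], parts[2]);
    }

    /** Relaunch or major restructuring: minor and bugfix numbers start over. */
    ReleaseNumber nextMajor() {
        return new ReleaseNumber(major + 1, 0, 0);
    }

    /** Release in between major releases: the bugfix number starts over. */
    ReleaseNumber nextMinor() {
        return new ReleaseNumber(major, minor + 1, 0);
    }

    /** Small update, e.g. a couple of changed pages. */
    ReleaseNumber nextBugfix() {
        return new ReleaseNumber(major, minor, bugfix + 1);
    }

    /** Compares numerically part by part, so r3.8.17 sorts before r3.10. */
    @Override
    public int compareTo(ReleaseNumber other) {
        if (major != other.major) {
            return Integer.compare(major, other.major);
        }
        if (minor != other.minor) {
            return Integer.compare(minor, other.minor);
        }
        return Integer.compare(bugfix, other.bugfix);
    }

    @Override
    public String toString() {
        return "r" + major + "." + minor + "." + bugfix;
    }
}
```

The comparison is numeric per part rather than a plain string comparison, which would sort r3.10 before r3.8.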

Each release can be accessed through a special named StagingResolver, that presents all versionables, their locations in the site resource tree and the attributes on the hierarchy nodes as they were when the versionables were checked in. (This presentation of releases through the resolver is called release mapping.) This way, the Sling components are rendered on the released versions of the versionables as they are stored in the JCR version store.

Releases can get the special labels (called release functions) public or preview. The public and preview release can each be presented e.g. on a separate virtual host. Making a different release public or releasing it as preview to the editors is as easy as moving these labels to a different release.

There are three access modes: author, preview and public. These usually correspond to virtual hosts used for authoring, previewing a release and presenting content to the general public.

As an alternative to rendering a release directly using the StagingResolver, the data visible through it can be replicated automatically into separate locations in the resource tree and rendered directly from there - by default /public/... and /preview/... instead of /content/... (In-Place Replication).
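As a rough illustration of the default In-Place Replication layout, the following sketch maps a path below /content to its replicated counterpart. The class and method names are hypothetical; only the /content, /public and /preview prefixes are taken from the description above, and the actual replication may be configured differently.

```java
/** Hypothetical sketch of the default path mapping used by In-Place Replication. */
final class InPlaceReplicationPaths {
    private static final String CONTENT_ROOT = "/content";

    /**
     * @param contentPath     a path at or below /content
     * @param releaseFunction either "public" or "preview"
     * @return the path at which the replicated copy would be found by default
     */
    static String replicatedPath(String contentPath, String releaseFunction) {
        if (!"public".equals(releaseFunction) && !"preview".equals(releaseFunction)) {
            throw new IllegalArgumentException("Unknown release function: " + releaseFunction);
        }
        if (!contentPath.equals(CONTENT_ROOT) && !contentPath.startsWith(CONTENT_ROOT + "/")) {
            throw new IllegalArgumentException("Not below " + CONTENT_ROOT + ": " + contentPath);
        }
        // Swap the leading /content for /public or /preview and keep the rest of the path.
        return "/" + releaseFunction + contentPath.substring(CONTENT_ROOT.length());
    }
}
```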

Delivery of versioned content via REST

The presentation of the releases (usually on separate virtual hosts for authoring, the preview release and the public release) can be configured in the OSGi configuration of the following filters. All of them have to be enabled, and possibly reconfigured, if the public and preview releases are to be presented transparently from JCR version storage using the StagingResolver.

For In-Place replication this is different: the virtual hosts have to be configured using the usual Sling Mappings to present the proper parts of the resource tree (/public/... or /preview/...).

OSGi-configurable filters relevant to release staging

| Filter name | Function |
| --- | --- |
| Platform Access Filter | Configures which virtual hosts are authoring hosts and which hosts are used to show public content, and which URLs (often resource tree parts) are generally (anonymously) accessible and which are not. |
| Pages Release Filter | If the Access Filter has determined that the virtual host accessed by the current request is not an author host, the Release Filter determines the site the request belongs to and which release should be presented for this kind of host. It is possible to configure which parts of the resource tree / which URL patterns are exempt from this mechanism. |
| Platform Release Resolver Filter | For public / preview hosts or if requested by a parameter / cookie, this filter replaces the normal Sling ResourceResolver in the request by a StagingResourceResolver. Configures which parts of the resource tree / which URL patterns are subject to the release mapping mechanism, and for which just the current content is shown. |
| Platform Staging Release Manager | The release manager service. It can be configured which parts of the current content tree (usually primarily the site configuration) are also visible in the releases. |

On author hosts (as determined by the Platform Access Filter), the presented release can also be selected by the parameter cpm.release, which memorizes the release number in the session. Otherwise, the current version is presented.
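The decision described above can be summarized in a few lines. This is a simplified sketch, not the filters' actual code: the class, the enum and the method are hypothetical, the session handling of cpm.release is left out, and throwing an exception when no release carries the required label is an assumption.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch, not Composum code: decides which release a request gets to see. */
final class ReleaseSelection {
    enum AccessMode { AUTHOR, PREVIEW, PUBLIC }

    /**
     * @param mode             access mode of the virtual host, as an access filter would determine it
     * @param releaseFunctions maps the labels "public" and "preview" to release numbers
     * @param requestedRelease value of the cpm.release parameter, or null if absent
     * @return the release number to present, or empty if the current content is shown
     */
    static Optional<String> releaseToPresent(AccessMode mode,
                                             Map<String, String> releaseFunctions,
                                             String requestedRelease) {
        if (mode == AccessMode.AUTHOR) {
            // Authors see the current content unless they explicitly ask for a release.
            return Optional.ofNullable(requestedRelease);
        }
        String function = mode == AccessMode.PUBLIC ? "public" : "preview";
        String release = releaseFunctions.get(function);
        if (release == null) {
            // Assumption: never fall back to unreleased content on non-author hosts.
            throw new IllegalStateException("No release carries the label " + function);
        }
        return Optional.of(release);
    }
}
```

In this model, making a different release public amounts to changing the value stored under the key "public", which mirrors moving the label to another release.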


Release storage in the JCR

The StagingReleaseManager is used to create and maintain the following structure of a site with its releases in the resource tree (JCR content). This structure is then used by the StagingResourceResolver to construct the resource tree of a release.

Example for the structure of releases in the JCR tree:

    site root (mixin cpl:releaseRoot) - the release / site root
        jcr:content (mixin cpl:releaseConfig) - the site configuration
            cpl:releases - a node containing data about the various releases
                r1.0 - the root for data for the release r1.0
                    metaData - a node where metadata about the release is stored
                    root - the release content tree containing
                        ... a copy of the working tree for r1.0, referencing versionables in the version storage
                r1.1 - the root for data for the release r1.1
                    ...
                current - the root for data about the current release

For each release, we store a copy of the tree of hierarchy nodes that contain the released versionables for this release in the release content root. The versionables are not stored by themselves, but as references (primary type cpl:versionReference) pointing a specific version of the versionable in the JCR version store.

The view of the resource tree presented through the StagingResourceResolver (and thus also in the case of In-Place replication the structure of the replications in /preview/... and /public/...) replaces the current content of the site root by the content below the release content root. The references to the released versions of the versionables in this tree are replaced by the content of the versionables reconstructed on the fly from the frozen nodes of that version in the version storage.
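The following toy model illustrates the idea of release mapping with plain maps instead of a JCR repository. All names are hypothetical, and the real StagingResourceResolver works on frozen nodes and Sling Resources rather than on string maps.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory model of release mapping; all names here are hypothetical. */
final class ReleaseMappingSketch {
    /** Release content tree: path relative to the release content root -> UUID of the referenced version. */
    private final Map<String, String> versionReferences = new HashMap<>();
    /** Stand-in for the version storage: version UUID -> frozen properties of the versionable. */
    private final Map<String, Map<String, String>> versionStorage = new HashMap<>();

    /** Records the frozen state of a versionable under the UUID of the new version. */
    void checkIn(String versionUuid, Map<String, String> frozenProperties) {
        versionStorage.put(versionUuid, new HashMap<>(frozenProperties));
    }

    /** Stores a reference to a specific version at the given location in the release. */
    void addToRelease(String relativePath, String versionUuid) {
        versionReferences.put(relativePath, versionUuid);
    }

    /**
     * Resolves a path as the release presents it: the reference is replaced on the fly
     * by the frozen content it points to. Returns null if the path is not in the release.
     */
    Map<String, String> resolve(String relativePath) {
        String versionUuid = versionReferences.get(relativePath);
        return versionUuid == null ? null : versionStorage.get(versionUuid);
    }
}
```

Because the release only stores the reference, later check-ins of the same versionable do not change what the release shows until the reference is updated.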

Additionally, we add the release number prefixed by composum-release- as a label to the versions contained in a release (e.g. composum-release-r3.1.4). This allows easily finding the versionables contained in a release in the version storage by a query.
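A sketch of the labelling convention: the helper below builds the label for a release number and a JCR-SQL2 statement that looks for versions carrying that label. The class and its methods are hypothetical. The query text is an assumption based on the JCR convention that each version label is a property of the nt:versionLabels node of a version history; it has not been checked against the statements Composum actually issues.

```java
/** Hypothetical helper around the composum-release- label convention. */
final class ReleaseLabels {
    static final String PREFIX = "composum-release-";

    /** Builds the version label for a release number, e.g. composum-release-r3.1.4. */
    static String labelFor(String releaseNumber) {
        // Restricting the format also keeps the label safe to embed in a query string.
        if (!releaseNumber.matches("r\\d+(\\.\\d+){0,2}")) {
            throw new IllegalArgumentException("Not a release number: " + releaseNumber);
        }
        return PREFIX + releaseNumber;
    }

    /** Extracts the release number from a label created by labelFor. */
    static String releaseNumberOf(String label) {
        if (!label.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not a release label: " + label);
        }
        return label.substring(PREFIX.length());
    }

    /** Assumed JCR-SQL2 statement selecting the label nodes that carry this release label. */
    static String labelQuery(String releaseNumber) {
        return "SELECT * FROM [nt:versionLabels] AS labels WHERE labels.["
                + labelFor(releaseNumber) + "] IS NOT NULL";
    }
}
```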

Tracking the version presented by the StagingResourceResolver

When a release is presented using the StagingResourceResolver (or replicated from this presentation to /public or /preview), a property cpl:replicatedVersion is added to each versionable which contains the uuid of the version that is presented - that is, the uuid of the version of that versionable which is part of the release. Since not all resources admit adding new properties (for example nt:file), a mixin mix:ReplicatedVersionable is added, that defines this property as reference.

Limitations

  • Since the StagingResourceResolver simulates a resource tree from checked in versions in the version storage, the usual querying mechanisms (XPATH or SQL2) cannot easily be used. Instead, we provide a querying mechanism that supports similar functionality to some extent (see section Querying).

Possible Extensions / Open Points

  • Documentation of the user interface
  • The StagingResourceResolver currently does not provide any modification functions, but it could provide such functions for unversioned resources.
  • The Composum Browser could support using a StagingResourceResolver with a specific release.
  • Queries: If you search for the top node of the versioned document (often jcr:content), Query.element works right, but if you build a condition to compare the name it won't find the node. This is very hard to fix, since this is the only node that is renamed (to jcr:frozenNode) and whose actual name is only encoded in history.[default].

QueryBuilder for transparent search within released, unreleased and unversioned content

Since it would be very difficult to emulate search in the version space using query languages like JCR XPATH or SQL2 directly, we instead provide a QueryBuilder. It supports queries over normal JCR ResourceResolvers as well as over content presented by a StagingResourceResolver for a specific release, without the caller having to bother with the details of the release's internal representation in the JCR.

For building a condition the selected nodes have to match, a Java-internal DSL ("domain specific language") in fluent API style is provided, which covers much of the JCR-SQL2 functionality and is intentionally very similar to SQL2. It does have, however, some limitations - see below.


Limitations

This results in some limitations that you need to understand when using the QueryBuilder:

  • Joins cannot cross boundaries from outside the release into the release, or from the release hierarchy nodes into released versionables stored in version storage.
  • When searching within the released versionable content, we can internally limit the search only to versionables living below the release root, but not to a specific release or path within one release. This is filtered only after performing the actual JCR query, and might lead to problematic performance.

Therefore, if query performance is important or the queries would cross the mentioned boundaries, the query mechanism should only be used on content accessible without a StagingResourceResolver (that is, the current content below a release root or replicated content at /public/... or /preview/...).

That said, the Query Builder can also be used in general on JCR content as an elegant DSL replacing SQL2 to create queries sporting JavaDoc documentation on most syntactical constructs and partial syntax checking through the Java compiler.

Methods of a Query

| Method of Query | Default | Meaning | Examples |
| --- | --- | --- | --- |
| path | | The absolute path of which the selected nodes must be subnodes. | /somewhere/startpoint |
| element | | Name of the selected nodes. | someelement |
| type | | Type of the selected nodes. Nodes that have this type as supertype or mixin are also found. | nt:base |
| orderBy | | Attribute of the selected nodes by which the result is sorted. | jcr:lastModified |
| ascending, descending | ascending | Switches to ascending or descending ordering when orderBy is set. | |
| condition | | Sets an optional additional condition that the selected node has to fulfill. A DSL-like builder can be used to create the condition. | Query.notNull(PROP_CREATED), Query.eq(Query.upper(PROP_TITLE), "THE TITLE") |

Query DSL

A query can be created by a QueryBuilder adapted from a ResourceResolver. The following example shows a simple case (where no joins are involved), including the SQL2-Statement that would be equivalent if no releases were involved:

    Query q = resourceResolver.adaptTo(QueryBuilder.class).createQuery();
    q.path(folder).element("elementname").type("cpp:Component").orderBy("jcr:title").descending();
    q.condition(q.conditionBuilder()
            .name().eq().val("something")
            .and().property("jcr:title").eq().val("the title")
            .or().contains("hello"));
    Iterable<Resource> resourceResults = q.execute();
    // or:
    Iterable<QueryValueMap> columnResults = q.selectAndExecute("jcr:path", "jcr:title", "jcr:score", "rep:excerpt", "jcr:lastModified");
    // SELECT n.[jcr:path], n.[jcr:title], n.[jcr:score], excerpt(n), n.[jcr:lastModified] FROM [cpp:Component] AS n
    // WHERE NAME(n)='something' AND n.[jcr:title]='the title' OR CONTAINS(n.*, 'hello')

The syntax is intentionally very similar to JCR-SQL2, but the implementation in Java allows for code completion with JavaDoc on each syntax construct, and ensures that the condition is syntactically complete. If parentheses are necessary for complex conditions, these can be created with .startGroup and .endGroup. For example, these ConditionDsl and JCR-SQL2 conditions are equivalent:

    q.conditionBuilder().isNotNull("jcr:created").and()
            .startGroup()
            .upper().property("jcr:title").eq().val("hello")
            .or().contains("jcr:title", "something")
            .endGroup()

    n.[jcr:created] IS NOT NULL AND ( UPPER( n.[jcr:title] ) = 'hello' OR CONTAINS(n.[jcr:title] , 'something') )

Joins

The Query interface supports joins. However, since querying the versioned releases has to be supported transparently, there are some limitations: the joined resources can only be children or descendants of the selected node, and a join cannot cross the boundary between unversioned and versioned nodes. The child condition would also fail at the boundary from outside the release tree into the release tree. That is, you have to choose the join such that you know in advance that either the selected node and the joined nodes are all subnodes of a versionable node, or none of them is.

    Query q = resourceResolver.adaptTo(QueryBuilder.class).createQuery();
    q.path(folder).type("cpp:PageContent").orderBy("jcr:score").descending();
    q.condition(q.conditionBuilder().contains("jcr:title", "hello"));
    QueryCondition joinCondition = q.joinConditionBuilder().contains("world");
    q.join(JoinType.Inner, JoinCondition.Descendant, joinCondition);
    Iterable<QueryValueMap> columnResults = q.selectAndExecute("jcr:path", joinCondition.joinSelector("jcr:path"));
    // SELECT n.[jcr:path], m.[jcr:path] FROM [cpp:PageContent] AS n
    // INNER JOIN [nt:base] AS m ON ISDESCENDANTNODE(m, n)
    // WHERE CONTAINS(n.[jcr:title], 'hello') AND CONTAINS(m.*, 'world')

Details on JCR versioning usage

Support of full JCR versioning is required. For all documents the jcr:content node is mix:versionable and has a linear history where each version is labelled with the releases it is in, if any. All documents are always checked out.

JCR VersionStorage permissions

Some experimental results about the behavior of the version storage /jcr:system/jcr:versionStorage:

  • A resolver can access the versions in /jcr:system/jcr:versionStorage only if it can access the path where the current versionable is stored, as recorded in the default property of the version history.
  • If a versionable is moved, the default property is modified automatically to match the new location.
  • If a document is deleted, the default property of the version history is unchanged, and the jcr:versionableUuid is not changed either.
  • The version history of a deleted versionable is still only accessible if the resolver can or could read the path in the default property of the version history. (Caution: this means that another user might be able to read it if they come to own that path later.)

This means that no service user is needed for the implementation of the StagingResolver, since the user's resolver has exactly the rights needed for reading the version storage.

Important classes

mostly in com.composum.sling.platform.staging and subpackages

| Class name | Description |
| --- | --- |
| StagingReleaseManager | General entry point for managing releases and getting a ResourceResolver (actually a StagingResourceResolver) that gives the contents of the release. Users of the staging / release mechanism normally only need to be concerned with this class (and possibly the QueryBuilder) since the mechanism emulates a normal ResourceResolver and Resources as far as possible. |
| ResourceResolverChangeFilter | Injects the appropriate resource resolver (StagingResourceResolver returned from the StagingReleaseManager) into the Sling Request, if a certain release was requested. The URLs / paths that are mapped to releases can be customized with patterns. |
| StagingResourceResolver | Resolves frozen nodes for a specified release using StagingResource as if they were normal Sling nodes. It provides a view that is as far as possible identical to the situation when the resources were checked in with the versions in the release. |
| StagingResource | Provides access to frozen nodes as if they were normal Sling Resources, and also serves as a ResourceWrapper for resources outside the release scope if they are resolved via the StagingResourceResolver. |
| QueryBuilder, Query | A query builder for transparently accessing releases, which creates a Query. |

The queries depend on whether they apply to a release presented by a StagingResourceResolver or to content that is not release-mapped. If the resolver is not a StagingResourceResolver, a single SQL2 query is performed on the nodes below path. However, on a StagingResourceResolver presenting a specific release, up to three SQL2 queries might need to be executed:

  1. a query for nodes below the release root excluding the releases node,
  2. a query for nodes below the release content root, and
  3. a query for all versions for which the version history's default attribute reaches into the release tree, and which have a label indicating that they belong to the queried release.

All queries return the path of the matching node, the attribute for orderBy and, if applicable, the attributes to be returned by the query.

The final results are the results of Queries 1, 2 and 3 merged, possibly observing the ordering by the orderBy attribute. Query 1 needs to return both versioned and unversioned nodes, since the results of Queries 1 and 2 need to be filtered by ReleaseMapper.releaseMappingAllowed(String path): of Query 2, only results for which ReleaseMapper yields true are returned; of Query 1, all unversioned nodes are returned, but only those versioned nodes for which ReleaseMapper yields false (they don't need to match the release). The results of Query 3 are filtered by whether they belong to a versionable whose location in the release matches the path, since this part of the condition cannot be formulated in the query directly.
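The filtering and merging rules can be sketched as follows. This is a simplified illustration under several assumptions: the class ReleaseQueryMerge and its Hit type are hypothetical, the ReleaseMapper check is passed in as a plain predicate, all paths are assumed to be already translated to the way the release presents them, the orderBy values are non-null strings, and the path check for Query 3 is reduced to a prefix test.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch of merging the results of the three queries for a release. */
final class ReleaseQueryMerge {
    /** One query hit: path as seen in the release, whether it lies in a versionable, and the orderBy value. */
    static final class Hit {
        final String path;
        final boolean versioned;
        final String orderValue;

        Hit(String path, boolean versioned, String orderValue) {
            this.path = path;
            this.versioned = versioned;
            this.orderValue = orderValue;
        }
    }

    static List<Hit> merge(List<Hit> query1, List<Hit> query2, List<Hit> query3,
                           Predicate<String> releaseMappingAllowed, String searchPath) {
        List<Hit> result = new ArrayList<>();
        for (Hit hit : query1) {
            // Current content: keep unversioned nodes, and versioned nodes exempt from release mapping.
            if (!hit.versioned || !releaseMappingAllowed.test(hit.path)) {
                result.add(hit);
            }
        }
        for (Hit hit : query2) {
            // Release content tree: keep only what is subject to release mapping.
            if (releaseMappingAllowed.test(hit.path)) {
                result.add(hit);
            }
        }
        for (Hit hit : query3) {
            // Version storage: the path restriction could not be part of the JCR query itself.
            if (hit.path.startsWith(searchPath + "/")) {
                result.add(hit);
            }
        }
        result.sort(Comparator.comparing((Hit hit) -> hit.orderValue));
        return result;
    }
}
```

Sorting the merged list in memory is the simplest way to restore the ordering. Merging three already sorted result streams would avoid loading everything at once, at the cost of more code.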

If the path reaches into a versionable in the release, only the corresponding nodes within that versionable are queried (Query 3).

Coarse structure of the query to search for versioned documents (Query 3)

    SELECT n.*
    FROM [nt:versionHistory] AS history
    INNER JOIN [nt:version] AS version ON ISCHILDNODE(version, history)
    INNER JOIN [nt:versionLabels] AS labels ON version.[jcr:uuid] = labels.[theRelease]
    INNER JOIN [nt:frozenNode] AS n ON ISDESCENDANTNODE(n, version)
    WHERE ISDESCENDANTNODE(history, '/jcr:system/jcr:versionStorage')
      AND history.[default] LIKE '/folder/%'

| selector | meaning |
| --- | --- |
| history | top node of the versioned node in version storage, with primaryType nt:versionHistory and its path as default attribute |
| version | child of history with type nt:version; its jcr:uuid is used in labels |
| labels | the nt:versionLabels that contains the releases as attributes, containing the jcr:uuid of the release node |
| n | a node to match with the condition |

(TODO: Is there any way to restrict labels to be a child of history? This could reduce the query runtime, but there is no way to formulate both the uuid condition and that child condition due to the weird syntax restrictions.)

Subcases:

  • path is outside of versioned content. The full version storage has to be searched for versionables whose paths extend path, as well as the unversioned content below path.
  • path reaches into versioned content. Only the version storage of this path has to be searched with a modified query.