Schematron Content Validation

Schematron enables you to enforce content architecture rules for your authoring team.

Schematron Use Case

Schematron is a good fit if you want to validate your DITA content architecture by enforcing structural rules specific to your organization. For example, you can define a rule that requires an unordered list element to contain at least two list item elements.

If you aim to extensively validate the language layer of your content for grammar and style, consider integrating your Heretto CCMS instance with an external language-validation tool (for example, HyperSTE). For more information, contact your Customer Success Manager.

Basic Schematron Rules

When Schematron is enabled, Heretto CCMS continuously evaluates your content and structure against the configured Schematron rules. When it detects content that triggers a rule, Heretto CCMS surrounds that content with an orange border and inserts a tick mark on the right side of the Content Editor. Hover over the tick mark to review the triggered Schematron rule.


Schematron rule example

Schematron Rules with QuickFix

Basic Schematron rules can be extended to trigger defined QuickFix activities with a click of a button. Currently, the QuickFix implementation in Heretto CCMS enables you to either add or replace XML nodes, that is, elements or attributes.


Schematron QuickFix example

Schematron Rules Development

Out of the box, Heretto provides a number of generic Schematron rules.

To implement custom Schematron rules, contact your Customer Success Manager. You can compose the rules on your own and provide them to Heretto for installation in your instance.

Schematron Rules Reference

You define Schematron rules in an SCH file.

Tip: Schematron rules use the XSLT 2.0 query binding, so you can use operators, functions, and syntax from the XPath 2.0 language. For detailed information, see XPath.
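
As an illustration of XPath 2.0 functions in a test, the following minimal sketch flags long short descriptions by counting words. The pattern ID LENGTH_01 and the 30-word limit are hypothetical values chosen for this example; they are not rules shipped with Heretto CCMS.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">

    <!-- Hypothetical rule: the pattern ID and the 30-word limit are example values. -->
    <pattern id="LENGTH_01">
        <rule context="shortdesc">
            <!-- normalize-space() trims and collapses whitespace; tokenize() is an
                 XPath 2.0 function that splits the text into words on whitespace. -->
            <report test="count(tokenize(normalize-space(.), '\s+')) &gt; 30">Shorten the Short
                Description element to 30 words or fewer.</report>
        </rule>
    </pattern>

</schema>
```

The test is evaluated relative to each element matched by the rule context, so `.` refers to the current Short Description element.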

Schematron Schema Example

The example includes both Basic Schematron Elements and Schematron QuickFix Elements (the elements with the sqf: prefix).

<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron"
    xmlns:sqf="http://www.schematron-quickfix.com/validator/process" queryBinding="xslt2">
    
    <!-- A Step or Substep element should not contain more than one Choices, Substeps, Step Example, or Info element. -->
    <pattern id="STRUCTURE_05">
        <rule context="step|substep">
            <assert test="count(info) &lt; 2">A Step or Substep element should not contain more than one Info
                element.</assert>
            <assert test="count(stepxmp) &lt; 2">A Step or Substep should not contain more than one Step
                Example element.</assert>
            <assert test="count(choices) &lt; 2">A Step or Substep should not contain more than one Choices
                element.</assert>
            <assert test="count(substeps) &lt; 2">A Step should not contain more than one Substeps
                element.</assert>
        </rule>
    </pattern>

    <!-- ... -->

    <!-- Unordered List and Ordered List elements should contain at least two List Item elements -->
    <pattern id="STRUCTURE_11">
        <rule context="ul|ol">
            <assert test="count(li) &gt; 1" sqf:fix="addListItem">Unordered List and Ordered List
                elements should contain at least two List Item elements.</assert>
            <sqf:fix id="addListItem">
                <sqf:description>
                    <sqf:title>Add a List Item element</sqf:title>
                    <sqf:p>To fix this issue, you can add another List Item element. If you cannot
                        come up with another List Item element, convert the list element to a Paragraph
                        element.</sqf:p>
                </sqf:description>
                <sqf:add node-type="element" target="li" position="last-child"/>
            </sqf:fix>
        </rule>
    </pattern>

</schema>

Schematron Schema Structure

The following structure includes both Basic Schematron Elements and Schematron QuickFix Elements (the elements with the sqf: prefix).

<schema> (single)

  • <pattern> (any number)
    • <rule> (any number)
      • <report> (any number)
      • <assert> (any number)
      • <sqf:fix> (optional, any number)
        • <sqf:description> (single)
          • <sqf:title> (single)
          • <sqf:p> (optional, single)
        • <sqf:add> (any number)
        • <sqf:replace> (any number)

Basic Schematron Elements

<schema>
The root element of a Schematron configuration file.

The <schema> element should include the following namespace declarations and attribute:

<schema xmlns="http://purl.oclc.org/dsdl/schematron" xmlns:sqf="http://www.schematron-quickfix.com/validator/process" queryBinding="xslt2">

The namespaces are specific to Schematron 2.0.

<pattern id="yourLabelHere">
Groups rules that are related in some way, for example, grammar rules.

Requires an @id attribute.

<pattern id="STRUCTURE_11">
The pattern @id attribute value is visible in the Schematron flag headers that appear in Heretto CCMS.
Grammar example
<rule>
Defines a context, by using an XPath 2.0 expression (XPath), in which to apply the Schematron assert or report tests to the content.

In this example, the Schematron rule would be applied to unordered and ordered list elements only.

<rule id="OrderedUnorderedLists" context="ul|ol">

In this example, the Schematron rule context matches a section element only when it has no preceding siblings, that is, when it is the first child of a reference body element.

<rule id="RefbodySections" context="refbody/section[count(preceding-sibling::*) = 0]">
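
To show how a rule context fits into a complete rule, here is a minimal sketch that wraps the context above in a pattern. The pattern ID and the assert test (the matched section must have a title) are hypothetical and serve only as an illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">

    <!-- Hypothetical pattern ID and test, shown only to illustrate the rule context. -->
    <pattern id="STRUCTURE_EXAMPLE">
        <rule id="RefbodySections" context="refbody/section[count(preceding-sibling::*) = 0]">
            <!-- The test is evaluated relative to each section matched by the context. -->
            <assert test="title">The first Section element in a Reference Body element should
                have a title.</assert>
        </rule>
    </pattern>

</schema>
```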
        
<assert>
Triggers a rule when the @test attribute value (XPath) evaluates to false.

In general, use an assert to evaluate whether something is missing in the content or the structure.

In the following example, a rule is triggered when a list contains only one list item.

<assert test="count(li) > 1">A list must have at least two list items.</assert>
<report>
Triggers a rule when the @test attribute value (XPath) evaluates to true.

In general, use a report to evaluate whether something is present in the content or the structure, and shouldn't be.

In the following example, a rule is triggered when the content contains the word "displays".

<report test="contains(., 'displays')">Use "shows" or "appears" instead.</report>
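
Because an assert fires when its test is false and a report fires when its test is true, the same constraint can be written either way by inverting the condition. The sketch below expresses the list rule in both forms. The pattern IDs are hypothetical, and in practice you would keep only one of the two patterns.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">

    <!-- Form 1 (hypothetical ID): assert states what must hold.
         The rule is triggered when count(li) > 1 is false. -->
    <pattern id="EXAMPLE_ASSERT">
        <rule context="ul|ol">
            <assert test="count(li) &gt; 1">A list must have at least two list items.</assert>
        </rule>
    </pattern>

    <!-- Form 2 (hypothetical ID): report states what must not occur.
         The rule is triggered when count(li) < 2 is true. -->
    <pattern id="EXAMPLE_REPORT">
        <rule context="ul|ol">
            <report test="count(li) &lt; 2">A list must have at least two list items.</report>
        </rule>
    </pattern>

</schema>
```

The two forms are kept in separate patterns because, within a single pattern, a node is evaluated only by the first rule whose context matches it.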

Schematron QuickFix Elements

Important: Currently, Heretto CCMS supports only the <sqf:add> and <sqf:replace> QuickFix actions.
<sqf:fix>
Contains either <sqf:add> or <sqf:replace> QuickFix actions.

Requires an @id attribute.

<sqf:fix id="addListItem">

To link a QuickFix action to a rule, reference the @id attribute value in the @sqf:fix attribute of the corresponding <report> or <assert> element.

<assert test="count(li) &gt; 1" sqf:fix="addListItem">A list must have at least two list items.</assert>

To apply a QuickFix action automatically, add the j:auto-apply-fix="true" attribute to the corresponding <report> or <assert> element.

<assert test="count(li) &gt; 1" sqf:fix="addListItem" j:auto-apply-fix="true">A list must have at least two list items.</assert>
Warning: We recommend thoroughly testing a given rule before using the j:auto-apply-fix="true" attribute.
<sqf:description>
Can contain an <sqf:title> element and an <sqf:p> element.
<sqf:title>
The title of the <sqf:fix> element. In Heretto CCMS, it is rendered as a QuickFix button.
QuickFix button example
<sqf:p>
An additional description of the QuickFix action. In Heretto CCMS, the description appears below the QuickFix button.
QuickFix with description
<sqf:add>
A QuickFix action that adds an XML node, for example, an element or an attribute.
Should have the @node-type, @target, and @position attributes assigned.
<!-- Unordered List and Ordered List elements should contain at least two List Item elements -->
<pattern id="STRUCTURE_11">
    <rule context="ul|ol">
        <assert test="count(li) &gt; 1" sqf:fix="addListItem">Unordered List and Ordered List
            elements should contain at least two List Item elements.</assert>
        <sqf:fix id="addListItem">
            <sqf:description>
                <sqf:title>Add a List Item element</sqf:title>
                <sqf:p>To fix this issue, you can add another List Item element. If you cannot
                    come up with another List Item element, convert the list element to a Paragraph
                    element.</sqf:p>
            </sqf:description>
            <sqf:add node-type="element" target="li" position="last-child"/>
        </sqf:fix>
    </rule>
</pattern>
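
The example above adds an element. As a sketch of the attribute case, the following fix adds an @xml:lang attribute to a root topic element. It assumes that Heretto CCMS honors the standard SQF @select attribute for the attribute value; that support is not confirmed here. The fix ID addLang and the value 'en-us' are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron"
    xmlns:sqf="http://www.schematron-quickfix.com/validator/process" queryBinding="xslt2">

    <!-- Each root topic element should have an @xml:lang attribute assigned -->
    <pattern id="STRUCTURE_02">
        <rule context="concept|task|reference|topic|troubleshooting">
            <assert test="@xml:lang" sqf:fix="addLang">Each root topic element should have an
                @xml:lang attribute assigned.</assert>
            <!-- Hypothetical fix: the ID and the language value are example values. -->
            <sqf:fix id="addLang">
                <sqf:description>
                    <sqf:title>Add an @xml:lang attribute</sqf:title>
                </sqf:description>
                <!-- Assumption: @select supplies the value of the new attribute. -->
                <sqf:add node-type="attribute" target="xml:lang" select="'en-us'"/>
            </sqf:fix>
        </rule>
    </pattern>

</schema>
```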
<sqf:replace>
A QuickFix action that replaces an XML node with another XML node.
<!-- Refer to the CCMS product by using the `varsProperNouns/productName` conkeyref -->
<pattern id="STYLE_30">
    <rule context="text()[matches(.,'[Hh][Ee][Rr][Ee][Tt][Tt][Oo]\W')]">
        <report test="." sqf:fix="productName">Refer to the CCMS product by using the
            `varsProperNouns/productName` conkeyref.</report>
        <sqf:fix id="productName">
            <sqf:description>
                <sqf:title>Insert Conkeyref</sqf:title>
            </sqf:description>
            <sqf:replace match="productName">
                <ph xmlns="" conkeyref="varsProperNouns/productName"/>
            </sqf:replace>
        </sqf:fix>
    </rule>
</pattern>

For more information about Schematron QuickFix elements and attributes, see the SQF User Guide.

Schematron Rules Development Guidelines

Keep the following guidelines in mind when developing Schematron rules.

Performance Guidelines

  • Be as specific as possible when you define tests and contexts for your rules
    <!-- Prerequisites, Postrequisites, Step Example, Step Result, Example, Result, and Context elements should not contain introductory text. The text is generated automatically (Specific to: heretto_pdf, color_pdf, gray_pdf publishing scenarios) -->
    <pattern id="STRUCTURE_10">        
        <rule context="prereq/p[1]">
            <report test="matches(., '^[Bb]efore\s*you\s*[Bb]egin')">Remove the introductory text. The
                text is generated automatically.</report>
            <report test="matches(., '^[Bb]efore\s*you\s*[Ss]tart')">Remove the introductory text. The
                text is generated automatically.</report>
            <report test="matches(., '^[Pp]rerequisites')">Remove the introductory text. The text is
                generated automatically.</report>
        </rule>
    
        <!-- ... -->
    
        <rule context="context/p[1]">
            <report test="matches(., '^[Cc]ontext')">Remove the introductory text. The text is
                generated automatically.</report>
        </rule>
    </pattern>
  • If you decide to develop a rule that validates the language layer of your content, specify the regular expression matches in the rule context rather than in the report or assert test
    <!-- Replace "display" with "show" or "appear". -->
    <pattern id="STYLE_07">
        <rule context="text()[matches(.,'[Dd]isplay(s|ing)?\W')]">
            <report test=".">Replace "display" with "show" or "appear". </report>
        </rule>
    </pattern>
  • If you need to implement a significant number of rules (50 or more) that validate the language layer of your content, consider integrating your Heretto CCMS instance with a dedicated language-validation tool (for example, HyperSTE)
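
The second guideline above can be illustrated by contrasting two forms of the same rule. In the first pattern, every text node becomes a rule context and the regular expression runs in the test. In the second, the regular expression in the context narrows the rule to the matching text nodes only. The ID STYLE_07_UNFILTERED is hypothetical and is shown for comparison only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">

    <!-- Discouraged form (hypothetical ID): the rule fires for every text node,
         and the regular expression is evaluated in the report test. -->
    <pattern id="STYLE_07_UNFILTERED">
        <rule context="text()">
            <report test="matches(., '[Dd]isplay(s|ing)?\W')">Replace "display" with "show" or
                "appear".</report>
        </rule>
    </pattern>

    <!-- Recommended form: the regular expression in the context selects only
         the text nodes that contain the word, so the test can simply be ".". -->
    <pattern id="STYLE_07">
        <rule context="text()[matches(., '[Dd]isplay(s|ing)?\W')]">
            <report test=".">Replace "display" with "show" or "appear".</report>
        </rule>
    </pattern>

</schema>
```

A further reason to prefer the narrow context: within a pattern, a node is evaluated only by the first rule whose context matches it, so a broad text() context would keep later rules in the same pattern from seeing any text node.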

Stylistic Guidelines

  • Word your assert and report messages consistently
    <!-- Each root topic element should have an @xml:lang attribute assigned -->
    <pattern id="STRUCTURE_02">
        <rule context="concept|task|reference|topic|troubleshooting">
            <assert test="@xml:lang">Each root topic element should have an @xml:lang attribute
                assigned.</assert>
        </rule>
    </pattern>
    
    <!-- Each topic should contain a Short Description element -->
    <pattern id="STRUCTURE_03">
        <rule context="concept|task|reference|topic|troubleshooting">
            <assert test="shortdesc|abstract/shortdesc">Each topic should contain a Short
                Description element.</assert>
        </rule>
    </pattern>
    
  • For rules that validate the language layer of your content, make sure the message names the word or phrase that needs to be replaced
    Rule example mentioning the word or phrase to be replaced
    <!-- Replace "desire" with "want" -->
    <pattern id="STYLE_19">
        <rule context="text()[matches(.,'[Dd]esire(s)?\W')]">
            <report test=".">Replace "desire" with "want".</report>
        </rule>
    </pattern>
  • Categorize your rules and apply consistent ID values to patterns

    For example, you can use the following categories or come up with your own:

    • Structure
    • Style
    • Grammar
    • Branding
    • Length
    • Numbers

Implementation Guidelines

  • Develop a map with topics whose content triggers each rule, so that you can test the rules
  • Thoroughly test your rules before deploying them to your production instance
  • To maintain a version history of your SCH file with Schematron rules, store the file in a source control repository
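
As a sketch of the first guideline, a test topic referenced from such a map might look like the following. The topic ID, title, and content are hypothetical. Each block is written to trigger one of the rules shown earlier (STRUCTURE_11 and STYLE_07).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<!-- Hypothetical test topic: each block is written to trigger one rule. -->
<concept id="schematron_rule_tests" xml:lang="en-us">
    <title>Schematron rule tests</title>
    <shortdesc>Content that intentionally triggers Schematron rules.</shortdesc>
    <conbody>
        <!-- STRUCTURE_11: a list with a single list item -->
        <ul>
            <li>Only one list item</li>
        </ul>
        <!-- STYLE_07: the word "displays" followed by a non-word character -->
        <p>The dialog box displays a warning.</p>
    </conbody>
</concept>
```

Keeping one labeled block per rule makes it easier to see which rule stopped firing after a change to the SCH file.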