SILCN 1.0

Selection, Identification and Location of Common Nodes

Prepared by:
Alex Brown, Griffin Brown Digital Publishing Ltd. (alexb@griffinbrown.co.uk)
Andrew Sales, Griffin Brown Digital Publishing Ltd. (andrews@griffinbrown.co.uk)
Nandini Das, Griffin Brown Digital Publishing Ltd. (nandinid@griffinbrown.co.uk)

Status of this document:
Draft

Summary

SILCN (pronounced 'silken') is a language that describes a flexible, lightweight framework for selecting, identifying and locating sets of common nodes in an XML document.

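The SILCN language consists of two parts:

  1. Part 1, the selection language - specifies the criteria by which nodes are selected;
  2. Part 2, the report language - reports the location of each node so selected.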

A SILCN processor must accept XML documents that conform to the specification in Part 1, and apply them, as specified here, to another XML document in order to emit an XML document that conforms to the specification in Part 2.

Background

SILCN emerged from commercial work undertaken to provide bespoke validation applications for XML documents. Buyers of these applications needed to apply tests to XML documents that could not be performed by existing XML constraint specification languages such as XML DTDs or existing schema languages.

Initially such applications were developed one by one using XML APIs (principally SAX [ref]). SILCN has emerged from a lengthy process of moving from this development model to one in which XML (rather than a conventional software development language) could be used to specify tests not available using DTDs or schema languages.

This specification has been reverse-engineered from an application of the technology as an XML quality assurance tool (in effect, a schema language). In practice, SILCN can be used in any application where it is necessary to select, identify and process specific sets of common nodes in an XML document.


Terminology

node
Any phenomenon of an XML document expressible as an Information Item, as defined in Section 2 ("Information Items") of W3C XML-InfoSet.
well-balanced
As defined in Section 3 ("Terminology") of W3C XML-Fragments.

Note:
The key words "MUST", "MUST NOT" and "MAY" in this document are to be interpreted as described in [RFC2119].

Normative references

W3C XML, Extensible Markup Language (XML) 1.0 (Second Edition), W3C Recommendation, 6 October 2000, available at http://www.w3.org/TR/2000/REC-xml-20001006.

W3C XML-InfoSet, XML Information Set, W3C Recommendation, 24 October 2001, available at http://www.w3.org/TR/xml-infoset/.

W3C XML-Names, Namespaces in XML, W3C Recommendation, 14 January 1999, available at http://www.w3.org/TR/1999/REC-xml-names-19990114.

W3C XML-Fragments, XML Fragment Interchange, W3C Candidate Recommendation, 12 February 2001, available at http://www.w3.org/TR/xml-fragment.

RFC 2119 [Keywords], Key words for use in RFCs to Indicate Requirement Levels, S. Bradner. Best Current Practice, March 1997, available at http://www.ietf.org/rfc/rfc2119.txt.


General language features

All SILCN documents are well-formed XML documents conforming to W3C XML. All SILCN elements specified here are in the XML Namespace http://silcn.org/200309, in conformance to W3C XML-Names.

SILCN has a grammar defined with a so-called 'open schema', meaning the content models for some SILCN elements allow 'points of opportunity' for the insertion of well-balanced application-specific XML content. Any XML elements in such content must not be within the SILCN Namespace.
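As an illustrative sketch of one such point of opportunity, the fragment below places an application-specific element after the SILCN children of a set-criterion element (described in Part 1). The app prefix, its Namespace URI http://example.org/app, the severity element, the identifier value and the XPath expression are all hypothetical and are not part of SILCN.

```xml
<silcn:set-criterion xmlns:silcn='http://silcn.org/200309'
                     xmlns:app='http://example.org/app'>
 <silcn:id>10001</silcn:id>
 <!-- assumes XPath has been declared as the expression language -->
 <silcn:expression>//para[not(node())]</silcn:expression>
 <!-- application-specific content: its elements are outside the SILCN Namespace -->
 <app:severity>warning</app:severity>
</silcn:set-criterion>
```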

Structures common to both Parts of the SILCN language

Some structures are common to both parts of the SILCN language specification:

Root element: silcn

All SILCN documents must have this root element.

SILCN version: version

The content of the version element identifies the version of SILCN that applies to any SILCN document. For SILCN 1.0 this element's content must be the literal string 1.0.
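A minimal outline of these two structures might look as follows. It is a sketch rather than a complete SILCN document: the comment stands in for the selection or report elements required by Parts 1 and 2, and the use of the silcn prefix is a convention borrowed from the examples in this document, not a requirement.

```xml
<?xml version='1.0'?>
<silcn:silcn xmlns:silcn='http://silcn.org/200309'>
 <!-- for SILCN 1.0 the content must be the literal string 1.0 -->
 <silcn:version>1.0</silcn:version>
 <!-- one or more selection (Part 1) or report (Part 2) elements follow -->
</silcn:silcn>
```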

Expressions language specification: expression-language-declaration element

SILCN does not mandate any particular expression language for expressing either selection criteria or location information for nodes in an XML document. Instead, the expression language is declared using the expression-language-declaration element, whose content specifies the expression language to which the content of any expression element will conform.

The expression-language-declaration element contains an initial name element, which names the expression language, followed by any well-balanced content.
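For illustration, a declaration naming XPath could be written as below. The el prefix, its Namespace URI and the el:version element are hypothetical application-specific content, shown only to indicate where such content may appear after the name element.

```xml
<silcn:expression-language-declaration xmlns:silcn='http://silcn.org/200309'>
 <!-- the initial name element names the expression language -->
 <silcn:name>XPath</silcn:name>
 <!-- hypothetical application-specific content, outside the SILCN Namespace -->
 <el:version xmlns:el='http://example.org/expression-language'>1.0</el:version>
</silcn:expression-language-declaration>
```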

Namespace prefix binding: namespace-declaration element

In SILCN documents an XML Namespace URI may be bound to a Namespace prefix for use in embedded XML processing languages.

To do this the namespace-declaration element is used: its two child elements (prefix and uri) are used to make the binding.
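A sketch of such a binding, here associating the prefix svg with the SVG Namespace, is given below. The choice of SVG is an assumption made purely for illustration; the child elements are given in the order used by the RELAX NG schema in this document (uri, then prefix).

```xml
<silcn:namespace-declaration xmlns:silcn='http://silcn.org/200309'>
 <silcn:uri>http://www.w3.org/2000/svg</silcn:uri>
 <silcn:prefix>svg</silcn:prefix>
</silcn:namespace-declaration>
```

With this binding in scope, and assuming XPath is the declared expression language, an expression such as //svg:rect would refer to rect elements in the SVG Namespace.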

Identifiers: id element

The id element is used within SILCN documents to model identifier values. In any SILCN document, no two id elements may have the same content.

Expressions: expression element

The expression element is used within SILCN documents to model expressions used for specifying either selection criteria (Part 1) or reporting locations (Part 2) of nodes.


Part 1: selection language

selection element

A document conforming to Part 1 of SILCN (the 'selection language') must have one or more elements named selection immediately following its version element.

Part 1 structure

Each selection element must have as its first child an expression-language-declaration element, which specifies the expression language used within that selection element. This is optionally followed by one or more namespace-declaration elements, which define namespace prefixes that may be used within that selection element.

The remainder of the element must consist of one or more set-criterion elements.

Specifying selection criteria: set-criterion

Each set-criterion specifies a criterion by which nodes of a processed document will be selected. The set-criterion element has two mandatory children:

  1. id - uniquely identifies the criterion;
  2. expression - gives an expression, conforming to the language specified in the preceding expression-language-declaration element, which can be applied to target documents for node selection.
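The fragment below sketches a selection element carrying two criteria, each with its own distinct id. It assumes XPath as the expression language and a hypothetical target vocabulary with section and title elements; the identifier values are likewise invented for the example.

```xml
<silcn:selection xmlns:silcn='http://silcn.org/200309'>
 <silcn:expression-language-declaration>
  <silcn:name>XPath</silcn:name>
 </silcn:expression-language-declaration>
 <!-- criterion 1: title elements with no non-whitespace content -->
 <silcn:set-criterion>
  <silcn:id>empty-title</silcn:id>
  <silcn:expression>//title[not(normalize-space())]</silcn:expression>
 </silcn:set-criterion>
 <!-- criterion 2: section elements lacking a title child -->
 <silcn:set-criterion>
  <silcn:id>section-without-title</silcn:id>
  <silcn:expression>//section[not(title)]</silcn:expression>
 </silcn:set-criterion>
</silcn:selection>
```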

RELAX NG schema for SILCN



default namespace = "http://silcn.org/200309"

start =
  element silcn {
    element version { text },
    any-well-balanced,
    (selection-model | report-model)+
  }
selection-model =
  element selection {
    expression-language-declaration-model,
    namespace-declaration-model*,
    any-well-balanced,
    set-criterion-model+
  }
set-criterion-model =
  element set-criterion {
    element id { text },
    element expression { any-well-balanced },
    any-well-balanced
  }
report-model =
  element report {
    expression-language-declaration-model,
    namespace-declaration-model*,
    any-well-balanced,
    matched-set-model*
  }
matched-set-model =
  element matched-set {
    element id { text },
    element node {
      element expression { any-well-balanced },
      any-well-balanced
    }+
  }
namespace-declaration-model =
  element namespace-declaration {
    element uri { text },
    element prefix { text }
  }
expression-language-declaration-model =
  element expression-language-declaration {
    element name { text },
    any-well-balanced
  }
# this model matches any well-balanced XML, or none
any-well-balanced =
  (text
   | element * {
       (attribute * { text }
        | text
        | any-well-balanced)*
     })*

Example SILCN Part 1 ("selection") instance

For the purposes of illustration, this example includes application-specific elements of the kind that may appear in a SILCN document but are not part of the SILCN specification. These are the elements outside the SILCN Namespace (that is, those without the silcn prefix).



<?xml version='1.0'?>

<silcn:silcn xmlns:silcn='http://silcn.org/200309'>

<silcn:version>1.0</silcn:version>

<silcn:selection>

<silcn:expression-language-declaration>
 <silcn:name>XPath</silcn:name>
 <version xmlns='http://foo.org'>1.0</version>
</silcn:expression-language-declaration>

<silcn:namespace-declaration>
 <silcn:uri>http://www.w3.org/1999/xhtml</silcn:uri>
 <silcn:prefix>xh</silcn:prefix>
</silcn:namespace-declaration>

<silcn:set-criterion>
 <silcn:id>40010</silcn:id>
 <silcn:expression>//xh:img[not(@alt)]
            |//xh:input[not(@alt)]
            |//xh:applet[not(@alt)]</silcn:expression>
 <msg>Element <eval>name()</eval> should have an alt attribute.</msg>
</silcn:set-criterion>

</silcn:selection>

</silcn:silcn>


Part 2: report language

report element

A document conforming to Part 2 of SILCN (the 'report language') must have a silcn root element whose first child is a version element. This may be followed by any well-balanced XML content, and this in turn must be followed by one or more elements named report.

report structure

A report element must have as its first child an expression-language-declaration element, as detailed above. This is optionally followed by one or more namespace-declaration elements, which define namespace prefixes that may be used within that report element. These may in turn be followed by any well-balanced content.

The remainder of the report element contains one or more matched-set elements.

Reporting on sets of matched nodes: matched-set

Each matched-set specifies a set of nodes which conform to a single criterion specified in a document conforming to the selection language. Its child elements are:

  1. id - gives a unique identifier for the criterion on which this set has matched;
  2. node (repeatable) - gives information for each node in the set so matched.

Each node element must contain as its first child an expression element, which expresses the location of the matched node in the language specified by the expression-language-declaration element of the ancestor report element.

Each expression element may be followed by any well-balanced XML content.

Example SILCN Part 2 ("report") instance

For the purposes of illustration, this example includes application-specific elements of the kind that may appear in a SILCN document but are not part of the SILCN specification. These are the elements outside the SILCN Namespace (that is, those without the silcn prefix).



<?xml version='1.0'?>

<silcn:silcn xmlns:silcn='http://silcn.org/200309'>

<silcn:version>1.0</silcn:version>

<document-uri>file:///c:/test/testdoc.htm</document-uri>
<run-date>2003-09-23</run-date>
<execution-time>5087</execution-time>

<silcn:report>

<silcn:expression-language-declaration>
 <silcn:name>XPath</silcn:name>
</silcn:expression-language-declaration>

<silcn:namespace-declaration>
 <silcn:uri>http://www.w3.org/1999/xhtml</silcn:uri>
 <silcn:prefix>xh</silcn:prefix>
</silcn:namespace-declaration>

<silcn:matched-set>
 <silcn:id>40010</silcn:id>
 <silcn:node>
  <silcn:expression>//xh:html/xh:body/xh:p/xh:img[1]</silcn:expression>
  <msg>Element <eval>img</eval> should have an alt attribute.</msg>
 </silcn:node>
 <silcn:node>
  <silcn:expression>//xh:html/xh:body/xh:form/xh:input[2]</silcn:expression>
  <msg>Element <eval>input</eval> should have an alt attribute.</msg>
 </silcn:node>
</silcn:matched-set>

</silcn:report>

</silcn:silcn>