Tangle

Part of Literate Programming in XML
05 Oct 2001
$Id: tangle.xweb,v 1.3 2001/10/06 13:39:05 nwalsh Exp $
Revision 0.1, 05 Oct 2001, ndw: Initial draft.
Norman Walsh

The tangle.xsl stylesheet transforms an &xweb; document into a source code document. This is a relatively straightforward process: starting with the top fragment, all of the source fragments are simply stitched together, discarding any intervening documentation. The resulting tangled document is ready for use by the appropriate processor.

The Stylesheet

This &xweb; document contains the source for two stylesheets, tangle.xsl and xtangle.xsl. Both stylesheets produce tangled sources; the latter is a simple customization of the former for producing XML vocabularies. Each of these stylesheets performs some initialization, sets the output method appropriately, begins processing at the root template, and processes fragments, copying the content appropriately.

The <filename>tangle.xsl</filename> Stylesheet

The tangle stylesheet produces text output.

The <filename>xtangle.xsl</filename> Stylesheet

The xtangle stylesheet produces XML output.
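The difference in output format comes down to the output declaration. The fragment itself is not reproduced in this text, so the following is only a minimal sketch of what such a declaration could look like, not the stylesheet's actual source:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- tangle.xsl (sketch): serialize the result tree as plain text. -->
  <xsl:output method="text"/>

  <!-- For xtangle.xsl the declaration would instead be:
       <xsl:output method="xml"/> -->

</xsl:stylesheet>
```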

Initialization

The stylesheet initializes the processor by loading its version information (stored in a separate file because it is shared by several stylesheets) and telling the processor to preserve whitespace on all input elements. The stylesheet also constructs a key for the ID values used on fragments. Because &xweb; documents do not have to be valid according to any particular DTD or Schema, the stylesheet cannot rely on having the IDs identified as type ID in the source document.
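The three initialization steps map onto three top-level XSLT declarations. The sketch below is an illustration under stated assumptions: the file name VERSION.xsl, the key name fragment, and the namespace URI bound to the src prefix are not given in this text and are assumed here.

```xml
<!-- Assumption: the namespace URI bound to src: is not stated in the text. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:src="http://nwalsh.com/xmlns/litprog/fragment"
                version="1.0">

  <!-- Shared version information; the file name is hypothetical
       and the file must exist for the stylesheet to load. -->
  <xsl:include href="VERSION.xsl"/>

  <!-- Keep whitespace-only text nodes in every source element. -->
  <xsl:preserve-space elements="*"/>

  <!-- Index fragments by their id attribute. A key works on any
       attribute, so no DTD is needed to declare it as type ID. -->
  <xsl:key name="fragment" match="src:fragment" use="@id"/>

</xsl:stylesheet>
```

Using xsl:key rather than the id() function is what removes the dependency on a DTD: id() only finds attributes that the parser has been told are of type ID.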

The Root Template

The root template begins processing at the root of the &xweb; document. It outputs a couple of informative comments and then directs the processor to transform the src:fragment element with the $top ID. Source code fragments in the &xweb; document are not required to be sequential, so it is necessary to distinguish one fragment as the primary starting point.
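A root template of this kind might look like the sketch below. It omits the informative comments, and it assumes that $top is a stylesheet parameter with the default value 'top'; the text names the variable but not its default or how it is declared.

```xml
<!-- Assumption: the namespace URI bound to src: is not stated in the text. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:src="http://nwalsh.com/xmlns/litprog/fragment"
                version="1.0">

  <xsl:output method="text"/>

  <xsl:key name="fragment" match="src:fragment" use="@id"/>

  <!-- ID of the starting fragment; the default 'top' is an assumption. -->
  <xsl:param name="top" select="'top'"/>

  <!-- Start at the document root and jump straight to the top fragment,
       skipping all surrounding documentation. -->
  <xsl:template match="/">
    <xsl:apply-templates select="key('fragment', $top)"/>
  </xsl:template>

</xsl:stylesheet>
```

Declaring the starting ID as a parameter would let a caller tangle a different fragment without editing the stylesheet.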

Processing Fragments

In order to tangle an &xweb; document, we need only copy the contents of the fragments to the result tree. Processing src:fragment elements is easy: simply copy their children.
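As a sketch (the namespace URI for the src prefix is an assumption), a template that processes a fragment by handing its children back to the processor could be written as:

```xml
<!-- Assumption: the namespace URI bound to src: is not stated in the text. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:src="http://nwalsh.com/xmlns/litprog/fragment"
                version="1.0">

  <!-- The fragment wrapper itself produces no output; only its
       children (text, references, other elements) are processed. -->
  <xsl:template match="src:fragment">
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```

Text children are copied by XSLT's built-in template for text nodes, so no explicit rule is needed for them.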

Copying Elements

Copying elements to the result tree can be divided into three cases: copying passthrough elements, copying fragment references, and copying everything else.

Copying <sgmltag>src:passthrough</sgmltag>

Passthrough elements contain text that is intended to appear literally in the result tree. We use XSLT disable-output-escaping to copy it without interpretation.
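A minimal sketch of such a template follows; the namespace URI for the src prefix is an assumption. Note that XSLT 1.0 does not require processors to support disable-output-escaping, so behavior can vary between implementations.

```xml
<!-- Assumption: the namespace URI bound to src: is not stated in the text. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:src="http://nwalsh.com/xmlns/litprog/fragment"
                version="1.0">

  <!-- Emit the string value of the passthrough element as-is, so that
       characters such as < and & are not escaped on output. -->
  <xsl:template match="src:passthrough">
    <xsl:value-of select="." disable-output-escaping="yes"/>
  </xsl:template>

</xsl:stylesheet>
```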

Copying <sgmltag>src:fragref</sgmltag>

With one exception, copying fragment references is straightforward: find the fragment that is identified by the cross-reference and process it. The exception arises only when processing src:fragref elements in the weave.xweb document: a single template in the weave program needs to copy a literal src:fragref element to the result tree. That is the only time the exceptional branch is executed.

Copying Normal Fragment References

To copy a normal fragment reference, identify what the linkend attribute points to, make sure it is valid, and process it.
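The lookup, the validity checks covered in this section, and the final processing step can be sketched as a single template. This is an illustration, not the stylesheet's own code. It assumes the namespace URI for the src prefix, looks the target up by scanning every element with a matching id (so that a link to a non-fragment can be detected), and stops with terminate="yes"; the text does not say how the original reports errors beyond the message wording.

```xml
<!-- Assumption: the namespace URI bound to src: is not stated in the text. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:src="http://nwalsh.com/xmlns/litprog/fragment"
                version="1.0">

  <xsl:template match="src:fragref">
    <!-- Every element whose id equals this reference's linkend. -->
    <xsl:variable name="fragment" select="//*[@id = current()/@linkend]"/>

    <!-- Check 1: the link must resolve to exactly one node. -->
    <xsl:if test="count($fragment) != 1">
      <xsl:message terminate="yes">
        <xsl:text>Link to fragment "</xsl:text>
        <xsl:value-of select="@linkend"/>
        <xsl:text>" does not uniquely identify a single fragment.</xsl:text>
      </xsl:message>
    </xsl:if>

    <!-- Check 2: that node must be a src:fragment. The self:: test
         compares both the local name and the namespace name. -->
    <xsl:if test="not($fragment/self::src:fragment)">
      <xsl:message terminate="yes">
        <xsl:text>Link "</xsl:text>
        <xsl:value-of select="@linkend"/>
        <xsl:text>" does not point to a src:fragment.</xsl:text>
      </xsl:message>
    </xsl:if>

    <!-- Valid reference: expand it in place. -->
    <xsl:apply-templates select="$fragment"/>
  </xsl:template>

  <xsl:template match="src:fragment">
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```

A scan with //* is simple but slow on large documents; a key over all elements carrying an id attribute would be the usual optimization.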

Fragment is Unique

Make sure that the linkend attribute points to exactly one node in the source tree. It is an error if no element exists with that ID value or if more than one exists. The error message reads: Link to fragment "(linkend value)" does not uniquely identify a single fragment.

Fragment is a <sgmltag>src:fragment</sgmltag>

Make sure that the linkend attribute points to a src:fragment element. FIXME: this code should test the namespace name of the $fragment. The error message reads: Link "(linkend value)" does not point to a src:fragment.

Copying Disable-Output-Escaping Fragment References

A src:fragref that specifies disable-output-escaping is treated essentially as if it were any other element; the only difference is that the disable-output-escaping attribute is not copied. Because tangle and weave are XSLT stylesheets that process XSLT stylesheets, processing src:fragref poses a unique challenge. In ordinary tangle processing, these references are expanded and replaced with the content of the fragment that they point to. But when weave.xweb is tangled, they must be copied through literally. The disable-output-escaping attribute provides the hook that allows this.
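One way to express the two branches is sketched below. It assumes that the mere presence of the disable-output-escaping attribute selects the literal-copy branch (the text does not say which values are tested) and, as elsewhere, the key name and the namespace URI for the src prefix are assumptions.

```xml
<!-- Assumption: the namespace URI bound to src: is not stated in the text. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:src="http://nwalsh.com/xmlns/litprog/fragment"
                version="1.0">

  <xsl:output method="xml"/>

  <xsl:key name="fragment" match="src:fragment" use="@id"/>

  <xsl:template match="src:fragref">
    <xsl:choose>
      <!-- Literal copy: rebuild the element, dropping only the
           disable-output-escaping attribute. -->
      <xsl:when test="@disable-output-escaping">
        <xsl:element name="{name(.)}" namespace="{namespace-uri(.)}">
          <xsl:copy-of select="@*[name(.) != 'disable-output-escaping']"/>
          <xsl:apply-templates/>
        </xsl:element>
      </xsl:when>
      <!-- Ordinary reference: replace it with the fragment's content. -->
      <xsl:otherwise>
        <xsl:apply-templates select="key('fragment', @linkend)"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template match="src:fragment">
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```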

Copying Everything Else

Everything else is copied verbatim. This is a five-step process:

1. Save a copy of the context node in $node so that we can refer to it later from inside an xsl:for-each.
2. Construct a new node in the result tree with the same qualified name and namespace as the context node.
3. Copy the namespace nodes on the context node to the new node in the result tree. We must do this manually because the &xweb; file may have broken the content of this element into several separate fragments, which makes it impossible for the XSLT processor to always construct the right namespace nodes automatically.
4. Copy the attributes.
5. Copy the children.

For non-XML source documents, this template will never match because there will be no XML elements in the source fragments.

Copy Namespaces

Copying the namespaces is a simple loop over the nodes on the namespace axis, with one wrinkle. It is an error to copy a namespace node onto an element if a namespace node is already present for that namespace. Because we have already constructed the result element explicitly in the correct namespace, attempting to copy that namespace node again will produce an error. We work around this problem by explicitly testing for that namespace and not copying it.
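The five copying steps and the namespace test can be sketched together as one template. This is an illustration under assumptions rather than the stylesheet's own code: it compares each namespace node's URI with the element's own namespace URI to decide what to skip, and it relies on the processor allowing xsl:copy on a namespace node.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="xml"/>

  <xsl:template match="*">
    <!-- Step 1: remember the element; inside xsl:for-each the
         context node changes to the namespace node being visited. -->
    <xsl:variable name="node" select="."/>

    <!-- Step 2: same qualified name and namespace as the source element. -->
    <xsl:element name="{name(.)}" namespace="{namespace-uri(.)}">

      <!-- Step 3: copy namespace nodes, skipping the element's own
           namespace, which xsl:element has already supplied. -->
      <xsl:for-each select="namespace::*">
        <xsl:if test="string(.) != namespace-uri($node)">
          <xsl:copy/>
        </xsl:if>
      </xsl:for-each>

      <!-- Step 4: copy the attributes. -->
      <xsl:copy-of select="@*"/>

      <!-- Step 5: process the children. -->
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

The string value of a namespace node is its namespace URI, which is why a plain string comparison against namespace-uri($node) is enough for the test.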

Copy XML Constructs

In the xtangle.xsl stylesheet, we also want to preserve XML constructs (processing instructions and comments) that we encounter in the fragments. Note that many implementations of XSLT do not expose comments from the source document (they are discarded before the tree is built), in which case the comments cannot be preserved.
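XSLT's built-in templates discard processing instructions and comments, so preserving them takes an explicit rule. A minimal sketch of such a rule:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="xml"/>

  <!-- Override the built-in rule, which produces nothing for these
       node types, and copy each one to the result tree unchanged. -->
  <xsl:template match="processing-instruction()|comment()">
    <xsl:copy/>
  </xsl:template>

</xsl:stylesheet>
```

The rule only fires for nodes the processor actually delivers, so it has no effect on comments that were dropped while the source tree was being built.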