man.charmap.subset.profile

man.charmap.subset.profile — Profile of character map subset

Synopsis

<xsl:param name="man.charmap.subset.profile">
@*[local-name() = 'block'] = 'Miscellaneous Technical' or
(@*[local-name() = 'block'] = 'C1 Controls And Latin-1 Supplement (Latin-1 Supplement)' and
 (@*[local-name() = 'class'] = 'symbols' or
  @*[local-name() = 'class'] = 'letters')
) or
@*[local-name() = 'block'] = 'Latin Extended-A'
or
(@*[local-name() = 'block'] = 'General Punctuation' and
 (@*[local-name() = 'class'] = 'spaces' or
  @*[local-name() = 'class'] = 'dashes' or
  @*[local-name() = 'class'] = 'quotes' or
  @*[local-name() = 'class'] = 'bullets'
 )
) or
@*[local-name() = 'name'] = 'HORIZONTAL ELLIPSIS' or
@*[local-name() = 'name'] = 'WORD JOINER' or
@*[local-name() = 'name'] = 'SERVICE MARK' or
@*[local-name() = 'name'] = 'TRADE MARK SIGN' or
@*[local-name() = 'name'] = 'ZERO WIDTH NO-BREAK SPACE'
</xsl:param>

Description

If the value of the man.charmap.use.subset parameter is non-zero, and your DocBook source is not written in English (that is, if the lang or xml:lang attribute on the root element in your DocBook source or on the first refentry element in your source has a value other than en), then the character-map subset specified by the man.charmap.subset.profile parameter is used instead of the full roff character map.

Otherwise, if the lang or xml:lang attribute on the root element in your DocBook source or on the first refentry element in your source has the value en or if it has no lang or xml:lang attribute, then the character-map subset specified by the man.charmap.subset.profile.english parameter is used instead of man.charmap.subset.profile.

The difference between the two subsets is that man.charmap.subset.profile provides mappings for characters used in Western European languages that are not part of the Roman (English) alphabet, that is, characters outside the ASCII character set.

The value of man.charmap.subset.profile is a string representing an XPath expression that matches attribute names and values for output-character elements in the character map.

The attributes supported in the standard roff character map included in the distribution are:

character
a raw Unicode character or numeric Unicode character-entity value (either in decimal or hex); all characters have this attribute
name
a standard full/long ISO/Unicode character name (e.g., "OHM SIGN"); all characters have this attribute
block
a standard Unicode "block" name (e.g., "General Punctuation"); all characters have this attribute. For the full list of Unicode block names supported in the standard roff character map, see the section called “Supported Unicode block names and "class" values”.
class
a class of characters (e.g., "spaces"). Not all characters have this attribute; currently, it is used only with certain characters within the "C1 Controls And Latin-1 Supplement" and "General Punctuation" blocks. For details, see the section called “Supported Unicode block names and "class" values”.
entity
an ISO entity name (e.g., "ohm"); not all characters have this attribute, because not all characters have ISO entity names; for example, of the 800 or so characters in the standard roff character map included in the distribution, only around 300 have ISO entity names.
string
a string representing a roff/groff escape-code (with "@esc@" used in place of the backslash), or a simple ASCII string; all characters in the roff character map have this attribute

The value of man.charmap.subset.profile is evaluated as an XPath expression at run-time to select a portion of the roff character map to use. You can tune the subset by adding or removing parts of the expression. For example, if you need to use a wide range of mathematical operators in a document, and you want them converted into roff markup properly, you might add the following, joined to the rest of the expression with or:

  @*[local-name() = 'block'] = 'Mathematical Operators'

That will cause an additional set of around 67 "math" characters to be converted into roff markup.
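
One way to apply such a change is through a customization layer. The following is a minimal sketch, not a definitive recipe: it assumes the stock manpages stylesheet can be imported from the canonical DocBook XSL URL (adjust the href to match your installation), repeats the default expression from the Synopsis, and appends the Mathematical Operators block with or.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Assumption: location of the stock manpages stylesheet; adjust as needed -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/manpages/docbook.xsl"/>

  <!-- Default profile from the Synopsis, plus one extra block at the end -->
  <xsl:param name="man.charmap.subset.profile">
@*[local-name() = 'block'] = 'Miscellaneous Technical' or
(@*[local-name() = 'block'] = 'C1 Controls And Latin-1 Supplement (Latin-1 Supplement)' and
 (@*[local-name() = 'class'] = 'symbols' or
  @*[local-name() = 'class'] = 'letters')
) or
@*[local-name() = 'block'] = 'Latin Extended-A'
or
(@*[local-name() = 'block'] = 'General Punctuation' and
 (@*[local-name() = 'class'] = 'spaces' or
  @*[local-name() = 'class'] = 'dashes' or
  @*[local-name() = 'class'] = 'quotes' or
  @*[local-name() = 'class'] = 'bullets'
 )
) or
@*[local-name() = 'name'] = 'HORIZONTAL ELLIPSIS' or
@*[local-name() = 'name'] = 'WORD JOINER' or
@*[local-name() = 'name'] = 'SERVICE MARK' or
@*[local-name() = 'name'] = 'TRADE MARK SIGN' or
@*[local-name() = 'name'] = 'ZERO WIDTH NO-BREAK SPACE' or
@*[local-name() = 'block'] = 'Mathematical Operators'
  </xsl:param>

</xsl:stylesheet>
```

Because English-language sources use man.charmap.subset.profile.english instead, the same addition would presumably need to be made to that parameter for documents whose lang or xml:lang value is en or is absent.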

Note

Depending on which XSLT engine you use, either the EXSLT dyn:evaluate extension function (for xsltproc or Xalan) or the saxon:evaluate extension function (for Saxon) is used to dynamically evaluate the value of man.charmap.subset.profile at run-time. If you don't use xsltproc, Saxon, Xalan, or some other XSLT engine that supports dyn:evaluate, you must either set the value of the man.charmap.use.subset parameter to zero and process your documents using the full character map instead, or set the value of the man.charmap.enabled parameter to zero (so that character-map processing is disabled completely).
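
The two fallbacks described in this note could be expressed in a customization layer roughly as follows. This is only a sketch: the import href is an assumption about where the stock stylesheet lives, and only one of the two parameters needs to be set.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Assumption: location of the stock manpages stylesheet; adjust as needed -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/manpages/docbook.xsl"/>

  <!-- Option 1: keep character-map processing, but use the full map rather than a subset -->
  <xsl:param name="man.charmap.use.subset" select="0"/>

  <!-- Option 2 (use instead of option 1): disable character-map processing completely -->
  <!-- <xsl:param name="man.charmap.enabled" select="0"/> -->

</xsl:stylesheet>
```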

An alternative to using man.charmap.subset.profile is to create your own custom character map, and set the value of man.charmap.uri to the URI/filename for that. If you use a custom character map, you will probably want to include in it just the characters you want to use, and so you will most likely also want to set the value of man.charmap.use.subset to zero.

You can create a custom character map by making a copy of the standard roff character map provided in the distribution, and then adding to, changing, and/or deleting from that.
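
A customization layer that points at such a copy might look like the sketch below. The file name my-roff-charmap.xml is hypothetical, as is the import href; how a relative path is resolved can depend on the processor, so an absolute URI may be the safer choice.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Assumption: location of the stock manpages stylesheet; adjust as needed -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/manpages/docbook.xsl"/>

  <!-- Hypothetical file name: an edited copy of the standard roff character map -->
  <xsl:param name="man.charmap.uri">my-roff-charmap.xml</xsl:param>

  <!-- Use everything in the custom map instead of a profiled subset -->
  <xsl:param name="man.charmap.use.subset" select="0"/>

</xsl:stylesheet>
```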

Caution

If you author your DocBook XML source in UTF-8 or UTF-16 encoding and aren't sure what OSes or environments your man-page output might end up being viewed on, or what version of nroff/groff those environments might have, you should be careful about what Unicode symbols and special characters you use in your source and what parts you add to the value of man.charmap.subset.profile.

Many of the escape codes used are specific to groff, and using them may not produce the expected output on an OS or environment that uses nroff instead of groff.

On the other hand, if you intend for your man-page output to be viewed only on modern systems (for example, GNU/Linux systems, FreeBSD systems, or Cygwin environments) that have a good, up-to-date groff, then you can safely include a wide range of Unicode symbols and special characters in your UTF-8 or UTF-16 encoded DocBook XML source and add any of the supported Unicode block names to the value of man.charmap.subset.profile.

For other details, see the documentation for the man.charmap.use.subset parameter.

Supported Unicode block names and "class" values

Below is the full list of Unicode block names and "class" values supported in the standard roff stylesheet provided in the distribution, along with a description of which codepoints from the Unicode range corresponding to that block name or block/class combination are supported.