This section describes the characteristics of the text being edited. A completed example:
<textDesc n="epic poetry">
  <channel mode="w"/>
  <constitution type="single"/>
  <derivation type="original"/>
  <domain type="art"/>
  <factuality type="fiction"/>
  <interaction type="none"/>
  <preparedness type="prepare"/>
  <purpose type="entertain" degree="high"/>
</textDesc>
The options for the @n attribute in our classification include "epic poetry" (used in the example above) and "history". We can add other variants as the corpus grows.
The elements contained in <textDesc> are empty: they carry no text content, only attribute values, which are predefined and must be selected manually. Text content can be added to an element if any clarification is required.
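Because these elements are normally empty, it can be useful to list the ones that do carry a clarification so they can be reviewed. The following is a minimal sketch, not part of the project's tooling: the function name `children_with_content` and the sample clarification text are hypothetical, and the fragment is assumed to be parsed without a namespace, as in the example above.

```python
import xml.etree.ElementTree as ET


def children_with_content(text_desc_xml):
    """Return the names of <textDesc> children that carry text or child nodes."""
    root = ET.fromstring(text_desc_xml)
    flagged = []
    for child in root:
        has_text = bool(child.text and child.text.strip())
        has_children = len(child) > 0
        if has_text or has_children:
            flagged.append(child.tag)
    return flagged


# Hypothetical fragment: <domain> carries a clarification, <channel> is empty.
sample = (
    '<textDesc n="epic poetry">'
    '<channel mode="w"/>'
    '<domain type="art">courtly audience</domain>'
    '</textDesc>'
)
print(children_with_content(sample))  # prints ['domain']
```

An empty result means every element holds attribute values only, which is the expected default.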
<channel>: the possible values of the transmission channel are:
- s (spoken)
- w (written)
- sw (spoken to be written) e.g. dictation
- ws (written to be spoken) e.g. a script
- m (mixed)
- x (unknown or inapplicable)
<constitution>: indicates the state in which the text is preserved:
- single – a single complete text
- composite – a text made by combining several smaller items, each individually complete
- frags – (fragments) a text made by combining several smaller, not necessarily complete, items
- unknown – composition unknown or unspecified
<derivation>: nature and degree of originality of the text:
- original – text is original
- revision – text is a revision of some other text
- translation – text is a translation of some other text
- abridgment – text is an abridged version of some other text
- plagiarism – text is plagiarized from some other text
- traditional – text has no obvious source but is one of a number derived from some common ancestor
<domain>: social context for which the text was composed or created:
- art – art and entertainment
- domestic – domestic and private
- religious – religious and ceremonial
- business – business and work place
- education – education
- govt – (government) government and law
- public – other forms of public context
<factuality>: the extent to which the text is fiction or non-fiction:
- fiction – the text is to be regarded as entirely imaginative
- fact – the text is to be regarded as entirely informative or factual
- mixed – the text contains a mixture of fact and fiction
- inapplicable – the fiction/fact distinction is not regarded as helpful or appropriate to this text
<purpose>: indicates the purpose or communicative function of the text. The values are free-form, e.g.:
- persuade – didactic, advertising, propaganda, etc.
- express – self expression, confessional, etc.
- inform – convey information, educate, etc.
- entertain – amuse, entertain, etc.
All this data is relevant when classifying documents, and will vary slightly from one work to another, so the editor should complete it carefully.
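Since most of these values are chosen by hand from fixed lists, a small script can catch typing slips. What follows is only a sketch under stated assumptions: the names `ALLOWED` and `invalid_values` are hypothetical, the permitted values are copied from the lists above, and elements whose values are free-form or not listed in this section (<purpose>, <interaction>, <preparedness>) are skipped rather than checked.

```python
import xml.etree.ElementTree as ET

# Element name -> (attribute checked, values permitted by the lists above).
# <purpose> is left out because its values are free-form.
ALLOWED = {
    "channel": ("mode", {"s", "w", "sw", "ws", "m", "x"}),
    "constitution": ("type", {"single", "composite", "frags", "unknown"}),
    "derivation": ("type", {"original", "revision", "translation",
                            "abridgment", "plagiarism", "traditional"}),
    "domain": ("type", {"art", "domestic", "religious", "business",
                        "education", "govt", "public"}),
    "factuality": ("type", {"fiction", "fact", "mixed", "inapplicable"}),
}


def invalid_values(text_desc_xml):
    """Return (element, attribute, value) for each value outside the lists."""
    root = ET.fromstring(text_desc_xml)
    problems = []
    for child in root:
        rule = ALLOWED.get(child.tag)
        if rule is None:
            continue  # element not covered by the lists above
        attr, allowed = rule
        value = child.get(attr)  # None when the attribute is missing
        if value not in allowed:
            problems.append((child.tag, attr, value))
    return problems


# The completed example from the start of this section passes the check.
filled_out = (
    '<textDesc n="epic poetry"> <channel mode="w"/> '
    '<constitution type="single"/> <derivation type="original"/> '
    '<domain type="art"/> <factuality type="fiction"/> '
    '<interaction type="none"/> <preparedness type="prepare"/> '
    '<purpose type="entertain" degree="high"/> </textDesc>'
)
print(invalid_values(filled_out))  # prints []

# A hypothetical slip: "written" instead of the listed value "w".
typo = '<textDesc n="history"><channel mode="written"/><domain type="art"/></textDesc>'
print(invalid_values(typo))  # prints [('channel', 'mode', 'written')]
```

A missing attribute is reported with the value `None`, so an incomplete element is flagged in the same pass as a mistyped one.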