<textDesc>

This section records the characteristics of the text being edited. A completed example:

<textDesc n="epic poetry">
	<channel mode="w"/>
	<constitution type="single"/>
	<derivation type="original"/>
	<domain type="art"/>
	<factuality type="fiction"/>
	<interaction type="none"/>
	<preparedness type="prepare"/>
	<purpose type="entertain" degree="high"/>
</textDesc>

The values of the @n attribute in our classification of <textDesc> are: "epic poetry", "tragedy", "mythography", "history". Other values can be added as the corpus grows.

The elements contained in <textDesc> normally have no content, only attribute values, which are predefined and must be selected manually. Text content can be added to an element if any clarification is required.
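As a minimal sketch of adding a clarification, the fragment below puts a short note inside two of the elements. The wording of both notes is hypothetical and only shows where such text would go; it does not describe any text in the corpus.

```xml
<!-- Illustrative only: both notes are invented for this example -->
<constitution type="frags">Only parts of the work survive.</constitution>
<derivation type="translation">Translated from an earlier version in another language.</derivation>
```

The attribute value still carries the classification; the note only explains it to a human reader.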

  • <channel>: the possible values of @mode, the transmission channel, are:
    • s (spoken)
    • w (written)
    • sw (spoken to be written) e.g. dictation
    • ws (written to be spoken) e.g. a script
    • m (mixed)
    • x (unknown or inapplicable)
  • <constitution>: indicates the state in which the text is preserved:
    • single – a single complete text
    • composite – a text made by combining several smaller items, each individually complete
    • frags – (fragments) a text made by combining several smaller, not necessarily complete, items
    • unknown – composition unknown or unspecified
  • <derivation>: nature and degree of originality of the text:
    • original – text is original
    • revision – text is a revision of some other text
    • translation – text is a translation of some other text
    • abridgment – text is an abridged version of some other text
    • plagiarism – text is plagiarized from some other text
    • traditional – text has no obvious source but is one of a number derived from some common ancestor
  • <domain>: social context for which the text was composed or created:
    • art – art and entertainment
    • domestic – domestic and private
    • religious – religious and ceremonial
    • business – business and work place
    • education – education
    • govt – (government) government and law
    • public – other forms of public context
  • <factuality>: the extent to which the text is fiction or non-fiction:
    • fiction – the text is to be regarded as entirely imaginative
    • fact – the text is to be regarded as entirely informative or factual
    • mixed – the text contains a mixture of fact and fiction
    • inapplicable – the fiction/fact distinction is not regarded as helpful or appropriate to this text
  • <purpose>: indicates the purpose or communicative function of the text. The values of @type are not a closed list; suggested values include:
    • persuade – didactic, advertising, propaganda, etc.
    • express – self expression, confessional, etc.
    • inform – convey information, educate, etc.
    • entertain – amuse, entertain, etc.

All of this data is used to classify documents and will vary slightly from one work to another, so the editor should fill it in carefully.
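To show how the values might differ between works, here is a hedged sketch of a <textDesc> for a text classified as "history". The attribute values are assumptions chosen for illustration from the lists above, not a prescribed encoding for any work; <interaction> and <preparedness> simply repeat the values of the example at the top of this page.

```xml
<!-- Illustrative sketch: values are assumed, not taken from a real header -->
<textDesc n="history">
	<channel mode="w"/>
	<constitution type="single"/>
	<derivation type="original"/>
	<domain type="education"/>
	<factuality type="mixed"/>
	<interaction type="none"/>
	<preparedness type="prepare"/>
	<purpose type="inform" degree="high"/>
</textDesc>
```

Compared with the epic poetry example, only @n, <domain>, <factuality> and <purpose> change, which is the kind of small variation the editor should watch for.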
