
Structure of a TEI-XML document

A TEI-XML document is a set of tags nested inside one another in a clearly defined hierarchy. The entire content of a TEI document is enclosed in the <TEI> tag:

<TEI> …… </TEI>

<TEI> is the only tag written entirely in capital letters. All other tags are written in lower case, or with an internal capital letter when the tag name joins or abbreviates two or more words (for example, <teiHeader>). XML tag names are case-sensitive, so the exact form of each one must always be respected.
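As an illustration of this naming convention, the fragment below shows a few element names from the general TEI Guidelines. It is only a sketch: the elements <fileDesc>, <titleStmt> and <title> are not introduced on this page, and the fragment is not a complete header.

```xml
<teiHeader>                  <!-- "tei" + "Header": internal capital letter -->
  <fileDesc>                 <!-- "file" + "Desc" (description) -->
    <titleStmt>              <!-- "title" + "Stmt" (statement) -->
      <title>...</title>     <!-- single word: all lower case -->
    </titleStmt>
  </fileDesc>
</teiHeader>
```

Writing <TeiHeader> or <filedesc> instead would produce a different, unrecognised tag name.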

Every TEI-XML document is divided into two main parts: the header, <teiHeader>, and the text itself, <text>:

<TEI>

	<teiHeader>
		... Contents of the document header ...
	</teiHeader>

	<text>
		... The document ...
	</text>

</TEI>
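For orientation, a minimal but complete file following this two-part division might look like the sketch below. The XML declaration, the TEI namespace, and the three children of <fileDesc> come from the general TEI Guidelines rather than from this page, and all content is placeholder text. The header and <text> conventions used in our editions are described in the sections that follow.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Placeholder title</title>
      </titleStmt>
      <publicationStmt>
        <p>Placeholder publication information</p>
      </publicationStmt>
      <sourceDesc>
        <p>Placeholder description of the source</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>Placeholder text</p>
    </body>
  </text>
</TEI>
```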

<teiHeader>

For full information and instructions on how to complete this section, see the chapter on <teiHeader>.

<text>

Since our editions contain original texts aligned in parallel with one or more translations, each of these versions of the text is itself a <text>. For a single XML document to contain several <text> elements, they must be grouped with the <group> tag. This allows you to add as many translations as you wish.

Thus, the <text> of all our editions has the following structure:

<text>
	<group>
		<text type="source" xml:lang="la">
			<body>
				... Contents of the critical edition: original text and critical apparatus ...
			</body>
		</text>
		<text type="translation" xml:lang="es">
			<body>
				... Contents of the parallel Spanish translation, with commentary in the form of notes ...
			</body>
		</text>
		<text type="translation" xml:lang="en">
			<body>
				... Contents of the parallel English translation, with commentary in the form of notes ...
			</body>
		</text>
	</group>
</text>

Note that each <text> within the <group> must carry two attributes:

  • @type: there are two possible values: "source" if the <text> contains the original text and its critical apparatus, and "translation" if it contains a translation.
  • @xml:lang: indicates the language in which the <text> is written. See the chapter @xml:lang for all the available language codes.
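As a sketch of how a further translation could be added, the skeleton below places one more <text> inside the <group>, with @type="translation" and its own @xml:lang. The French translation and the code "fr" are hypothetical and chosen only for illustration; check the @xml:lang chapter for the codes actually used. The comments stand in for the real contents of each <body>.

```xml
<text>
  <group>
    <text type="source" xml:lang="la">
      <body>
        <!-- original text and critical apparatus -->
      </body>
    </text>
    <text type="translation" xml:lang="es">
      <body>
        <!-- parallel Spanish translation, with notes -->
      </body>
    </text>
    <text type="translation" xml:lang="fr">
      <body>
        <!-- hypothetical additional translation, with notes -->
      </body>
    </text>
  </group>
</text>
```

Each new translation is a sibling of the existing <text> elements, so the source text and the other translations do not need to change.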

<body>

Each <text> in the root <group> of our editions contains a single <body> tag, the body of the document.

The <body> contains the original text, the apparatus, the translations, and the notes that make up the commentary.

Each work has its own internal structure within its <body>: an epic poem is not structured like a tragedy or a work of prose. The different structures for each type of text are described in the chapter on the corresponding tag and the sub-chapters therein.

To learn which tags can be used within the <body> in order to compose editions, see the chapter Important Tags.

On how to tag original texts and translations in an aligned way, see the chapter Alignment of original texts and translations.
