Brief Technical Documentation
This is the Brief Technical Documentation as it was published in September 2013 with the publication of the second module: L'Innommable / The Unnamable. The document in its very first state
(published on 24/06/2011) can be found here. The document in its current version can be found here.
The electronic edition is encoded in XML (eXtensible Markup Language).
The encoding design started from the P5 version of the Guidelines of the TEI (Text Encoding
Initiative) and expands the DTD (Document Type Definition) with project-specific tags and
attributes where needed.[1] The L'Innommable/The
Unnamable module is based on version 2.3.0 of TEI P5[2], while the first module
published in 2011 is based on version 1.0.0.[3] The tagset of the first
module also incorporated some tags from a working document of the TEI SIG "Manuscripts".[4] A number of these
proposed tags were introduced into TEI P5 version 2.0.0.[5], some in modified form.
Because of these modifications and with the following modules in mind, we decided to comply
with the P5 version that was current at the time the XML transcriptions of this module were
finished (June 2013).[6]
The encoding is based on the definitions of crucial notions such as
'document', 'text', 'version', and 'work' by Peter Shillingsburg in his book Scholarly Editing in the Computer Age, notably the chapter entitled 'Ontology'.
The edition is published in a Java framework.[7]
Metadata
The header contains metadata such as a title statement and
publication statement, mentioning the coordinates of the Centre for Manuscript Genetics
(University of Antwerp); a brief source description; and a profile description with
information on the languages and the handwritings in the document.
Structural tags
<text>: This tag is used to indicate
'the actual order of words and punctuation as contained in any one physical form' (Shillingsburg 1996: 46). The
physical form is paper and ink. As a physical vessel the document contains only one
text, but it may contain more than one version of more than one work - a 'work' being
'the message or experience implied by the authoritative versions of a literary writing'
(Shillingsburg 1996: 176).
The archive catalogue number serves as unique id.
<div>: Used without attributes, this
tag indicates a version, i.e. 'one specific form of the work - the one the author
intended at some particular moment in time' (Shillingsburg 1996: 44). The writing layers are
indicated by means of <del> (deletion) and <add> (addition) tags.
<div type="paralipomena">: Apart from
versions, a document may also contain fragments of text (jottings, notes, reflections,
try-out sentences, and so forth) which strictly speaking do not belong to a version of a
work. These paralipomena are indicated by means of the tag <div
type="paralipomena">.
<p>: Versions or paralipomena may
consist of several paragraphs.
<seg>: Each paragraph usually consists
of several sentences. When Beckett did not work with full sentences (e.g. in Not I / Pas moi) the segment consists of a few lines of text, i.e.
a unit of text that can easily be compared to other versions.
<anchor type="subsentence"/>: In
L'Innommable/The Unnamable, there are a number of very long sentences, too
long, in fact, to allow for a sentence-by-sentence comparison. These sentences have been
subdivided into two or more "subsentences" by means of anchors in the text.
Global attributes
xml:id
The xml:id of the <text> tag is the document's archive number.
According to Peter Shillingsburg's definition, the variant forms of a work usually have
the same name, but in some cases 'there will be disagreement over whether a variant form
is in fact a variant version or a separate work' (176).
xml:lang
This attribute indicates the language in which the version is
written.
n
The catalogue number is followed by the number of the sentence in the
base text (see chapter "base
texts"):
<seg n="MS-UoR-2934,[0127]">
In the case of a sentence that eventually did not make it into the
base text, the number of the preceding sentence that did make it
into the base text is followed by | and an extra number:
<seg n="MS-UoR-2934,[0127|001]">
The first number always consists of 4 digits: 0001 and so on; the
second number, after the |, always consists of 3 digits. In the visualization, this
extra sentence (or phrase) appears in bold, because it constitutes a deviation from the
base text.
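The numbering scheme described above can be sketched as a small parser. This is only an illustration, not the project's own code: the pattern is inferred from the two examples given, and the names parse_n, SentenceRef and deviates_from_base_text are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the two examples above (an assumption):
# catalogue number, comma, then a 4-digit base-text sentence number
# with an optional |3-digit extra number, all inside square brackets.
N_PATTERN = re.compile(
    r"^(?P<catalogue>[^,]+),\[(?P<sentence>\d{4})(?:\|(?P<extra>\d{3}))?\]$"
)


class SentenceRef(NamedTuple):
    catalogue: str
    sentence: int
    extra: Optional[int]  # None when the sentence is part of the base text


def parse_n(value: str) -> SentenceRef:
    """Split an n attribute value into its components."""
    match = N_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unrecognised n value: {value!r}")
    extra = match.group("extra")
    return SentenceRef(
        match.group("catalogue"),
        int(match.group("sentence")),
        int(extra) if extra is not None else None,
    )


def deviates_from_base_text(ref: SentenceRef) -> bool:
    """True for sentences that did not make it into the base text."""
    return ref.extra is not None


print(parse_n("MS-UoR-2934,[0127]"))
print(parse_n("MS-UoR-2934,[0127|001]"))
```

A visualization layer could use a check like deviates_from_base_text to decide which sentences to show in bold.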
version
This attribute indicates the chronological order of the versions of a
textual unit (section, paragraph, sentence).
In L'Innommable/The Unnamable, the chronology of versions largely corresponds to the chronology of the documents.[8] Only in cases where there is more than one version of the same sentence within one document and where the order of writing does not correspond to the documentary order, a version attribute has been added to the <seg> tag to encode the correct chronology.
In Stirrings Still/Soubresauts, the chronology is a lot more complex and version attributes have been added to all sections, paragraphs and sentences.
In the case of partial versions the version number is followed by a
letter (e.g. typescript version 12 of Stirrings Still/Soubresauts contains a
redraft of its last paragraph; this redrafted paragraph is indicated by the number 12a).
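Version values such as 12 and 12a can be ordered with a simple sort key. The sketch below assumes that a partial version sorts directly after the full version whose number it shares; that ordering, and the name version_key, are assumptions made for illustration.

```python
import re


def version_key(version: str):
    """Turn '12' or '12a' into a sortable (number, letter) pair."""
    match = re.fullmatch(r"(\d+)([a-z]?)", version)
    if match is None:
        raise ValueError(f"unrecognised version value: {version!r}")
    # An empty letter sorts before 'a', so 12 comes before 12a.
    return (int(match.group(1)), match.group(2))


print(sorted(["12a", "2", "12", "17"], key=version_key))
```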
zone
<seg> tags have a zone attribute which
holds the name(s) of the zone(s) on a page in the image / text feature that the sentence
is a part of.
section
(only applied in Stirrings
Still/Soubresauts)
In its published form, Stirrings Still/Soubresauts
consists of three sections, numbered 1, 2, and 3. Whenever a <div> can be identified as an early version of one of these three
sections, its section attribute holds the section number 1, 2, or 3.
<div section="1">
<div section="2">
<div section="3">
A few blocks of text on the extant documents cannot be identified
with any of the three sections, and yet they are more than just loose jottings or
paralipomena. The author developed them in several versions, until he decided that
this was a dead end. In the case of Stirrings Still there are
3 abandoned sections. They are also encoded as <div>s, but the section number
is preceded by a zero:
<div section="01">
<div section="02">
<div section="03">
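The leading-zero convention can be read mechanically. A minimal sketch, assuming that section values are always plain digit strings; the function name describe_section is hypothetical.

```python
def describe_section(value: str) -> dict:
    """Interpret a section attribute value such as '1' or '02'."""
    if not value.isdigit():
        raise ValueError(f"unrecognised section value: {value!r}")
    # A leading zero marks a section that was abandoned by the author.
    abandoned = len(value) > 1 and value.startswith("0")
    return {"number": int(value), "abandoned": abandoned}


print(describe_section("1"))
print(describe_section("02"))
```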
time
(only applied in Stirrings
Still/Soubresauts)
The 'version' attribute indicates the chronological sequence of
versions of one single section, whereas the 'time' attribute indicates the
chronological sequence of all the <div>s,
irrespective of the sections.
chrono
(only applied in Stirrings
Still/Soubresauts)
Since there are 3 sections and 3 abandoned sections, there are 6
sections in all; the 'chrono' attribute indicates their chronological order, whether
they made it into the published version or not.
trans / orig
(only applied in Stirrings
Still/Soubresauts and Comment dire/ what is the word)
These attributes are only relevant for bilingual works: if a
version was translated by the author (sometimes already during the writing process)
the source text is encoded with the attribute 'orig' and the target text with the
attribute 'trans', both followed by the same number relating them to each other. For
instance, the 18th version of section 1 is a translation of version 17. It is the
third translation in the genetic dossier, hence the code orig="03" and trans="03".
<div section="1" chrono="4" version="17" xml:lang="EN" orig="03">
<div section="1" chrono="4" version="18" xml:lang="FR" trans="03">
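The shared number in orig and trans makes it possible to pair a source text with its authorial translation. The following sketch uses Python's standard xml.etree.ElementTree on the two <div> tags above; the wrapping <text> element, the absence of a TEI namespace and the function name pair_translations are simplifying assumptions.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the xml: prefix to the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SAMPLE = """<text>
  <div section="1" chrono="4" version="17" xml:lang="EN" orig="03"/>
  <div section="1" chrono="4" version="18" xml:lang="FR" trans="03"/>
</text>"""


def pair_translations(root):
    """Return (source, translation) pairs that share an orig/trans number."""
    sources = {div.get("orig"): div for div in root.iter("div") if div.get("orig")}
    pairs = []
    for div in root.iter("div"):
        number = div.get("trans")
        if number is not None and number in sources:
            pairs.append((sources[number], div))
    return pairs


for source, target in pair_translations(ET.fromstring(SAMPLE)):
    print(source.get("version"), source.get(XML_LANG), "->",
          target.get("version"), target.get(XML_LANG))
```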
The attributes are combined to allow readers to retrieve the
transcripts from different perspectives:
The documents can be studied in the order of their catalogue
numbers.
This option only requires the xml:id of the <text> tag:
<text xml:id="MS-UoR-2933-1">
A chronological approach rearranges the transcripts of the drafts
in the order of their composition. This option is a combination of the section, time
and chrono attributes in the <div> tag:
<div section="1" chrono="4" version="17" xml:lang="EN" orig="03" time="45">
The rearrangement per language shows that Beckett often switched
between French and English during the writing process.
<div section="1" chrono="4" version="17" xml:lang="EN" orig="03" time="45">
Translations (i.e. authorial translations) are distinguished from
versions that were written directly in the target language:
<div section="1" chrono="4" version="17" xml:lang="EN" orig="03" time="45">
<div section="1" chrono="4" version="18" xml:lang="FR" trans="03" time="48">
The 'Compare versions' approach arranges the versions that did make it into the published text according to their position
in the narrative structure. For this option the section and
version attributes suffice:
<div section="1" chrono="4" version="17" xml:lang="EN" orig="03" time="45">
The textual material can be rearranged from several perspectives
(see menu) by combining different attributes:
Perspective | Element | Attributes
Documents | <text> | xml:id
Chronology | <div> | section + time + chrono
Language | <div> | section + version + xml:lang
Translations | <div> | section + trans/orig
Compare versions | <div> | section + version
 | <p> | section + version
 | <seg> | n + version
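As an illustration of how such attribute combinations could drive a rearrangement, the sketch below sorts <div> elements by their time attribute and then groups them by language. The <dossier> wrapper is a hypothetical container, the two <div> tags are taken from the examples above, and the function names are invented for this sketch.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Deliberately listed out of chronological order.
SAMPLE = """<dossier>
  <div section="1" chrono="4" version="18" xml:lang="FR" trans="03" time="48"/>
  <div section="1" chrono="4" version="17" xml:lang="EN" orig="03" time="45"/>
</dossier>"""


def chronological(root):
    """All <div> elements in order of composition, irrespective of section."""
    return sorted(root.iter("div"), key=lambda div: int(div.get("time")))


def by_language(root):
    """Version numbers grouped per language, each group in chronological order."""
    groups = defaultdict(list)
    for div in chronological(root):
        groups[div.get(XML_LANG)].append(div.get("version"))
    return dict(groups)


root = ET.fromstring(SAMPLE)
print([div.get("version") for div in chronological(root)])
print(by_language(root))
```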
The numbering of the sections, paragraphs and sentences enables
the user to adapt the size of the textual unit s/he wishes to compare.
<div section="1" chrono="4" version="17">
<p section="1.2" version="17">
<seg n="MS-UoR-2933-1,[0055]" version="17">
Textual Alterations
The most frequently occurring tags in the XML transcriptions are
deletions and additions:
<del>: For each cancelled phrase the
type of cancellation, the author of the cancellation, and the writing tools are
indicated, as well as the person responsible for the transcription (the editor):
<del type="crossOut" hand="#SB" rend="black ink" resp="#DVH">...</del>
In the case of instant alterations (currente
calamo) the type attribute value is 'instant correction'. Instant corrections
are only marked if there is no doubt that the cancellation cannot have been introduced
at a later stage: for instance in the sentence 'perhaps not again never
to be heard again', 'not again' is
marked as being followed by an instant correction; 'to be heard' is not, because the
cancellation may have been introduced at a later stage.
<delSpan spanTo="#anchor"/>: For
passages cancelled by Beckett or 'marked as used', three types can be distinguished:
heavily crossed out, a diagonal line or a St. Andrew's cross.
<add>: For additions the place of the
addition is also indicated:
<add place="marginleft" hand="#SB" rend="black ink" resp="#DVH">...</add>
The place indications used in the present edition are:
'marginleft,' 'marginright,' 'margintop,' 'marginbottom,'
'facingleaf,' 'inline,' 'supralinear,' 'infralinear,' 'overwritten.'
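Because the writing layers are marked with <del> and <add>, a 'top layer' reading can be derived by dropping cancelled text and keeping additions. The sketch below is one possible way to do this and is not taken from the edition's own code; the markup of the sample sentence is reconstructed from the description of instant corrections above, and the type value of the second <del> is an assumption.

```python
import xml.etree.ElementTree as ET

# Reconstructed from the prose example; attribute values are illustrative.
SAMPLE = (
    '<seg>perhaps <del type="instant correction" hand="#SB">not again</del>'
    ' never <del type="crossOut" hand="#SB">to be heard</del> again</seg>'
)


def top_layer(element) -> str:
    """Text of an element without the content of any <del> descendants."""
    parts = []

    def walk(node):
        if node.tag == "del":
            return  # skip cancelled text; its tail is added by the caller
        parts.append(node.text or "")
        for child in node:
            walk(child)
            parts.append(child.tail or "")

    walk(element)
    # Collapse the double spaces left behind by removed deletions.
    return " ".join("".join(parts).split())


print(top_layer(ET.fromstring(SAMPLE)))
```

Content inside <add> is kept, because only <del> elements are skipped.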
Open variants: alternative readings
Open variants have been marked up in this way:
when at last out again <seg type="alternative" xml:id="alt1">he knew not</seg>
<add place="above" type="alternative" xml:id="alt2">no knowing</add>
Transpositions
Transpositions (when the author moves blocks of text to a different
position, using arrows, asterisks, numbers or lines) have been marked up in this
way:
where he sits <seg type="transposition" xml:id="trans1">at his table</seg> <seg type="transposition" xml:id="trans2">head on hands</seg>.
All transpositions are declared in the header:
<listTranspose>
<transpose>
<ptr target="#trans2"/>
<ptr target="#trans1"/>
</transpose>
</listTranspose>
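One way to read this declaration is that the <ptr> elements list the transposed segments in their new order. Under that assumption, the reading after transposition can be sketched as follows; the <doc> wrapper and the function name transposed_reading are hypothetical, and only segments that are direct children of the paragraph are handled.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

SAMPLE = """<doc>
  <listTranspose>
    <transpose>
      <ptr target="#trans2"/>
      <ptr target="#trans1"/>
    </transpose>
  </listTranspose>
  <p>where he sits <seg type="transposition" xml:id="trans1">at his table</seg> <seg type="transposition" xml:id="trans2">head on hands</seg>.</p>
</doc>"""


def transposed_reading(paragraph, transpose) -> str:
    """Paragraph text with the transposed segments in their declared order."""
    new_order = [ptr.get("target").lstrip("#") for ptr in transpose.iter("ptr")]
    by_id = {seg.get(XML_ID): seg for seg in paragraph.iter("seg")}
    # Positions (in document order) that take part in the transposition.
    slots = [child.get(XML_ID) for child in paragraph
             if child.get(XML_ID) in new_order]
    moved = dict(zip(slots, new_order))
    parts = [paragraph.text or ""]
    for child in paragraph:
        target = moved.get(child.get(XML_ID))
        source = by_id[target] if target is not None else child
        parts.append("".join(source.itertext()))
        parts.append(child.tail or "")
    return "".join(parts)


root = ET.fromstring(SAMPLE)
print(transposed_reading(root.find("p"), root.find("listTranspose/transpose")))
```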
Metamarks
Some passages or signs do not, strictly speaking, belong to the
version: paralipomena, dates and place names, numberings, stamps and 'metamarks',
defined by the TEI as 'any kind of graphic or written signal within a document the
function of which is to determine how it should be read rather than forming part of the
actual content of the document'. In the BDMP these features are encoded as follows:
<metamark>: indicates metamarks, such as 'Stet' as a way of undoing a cancellation; or, for instance, two corresponding instances of the letter 'A' indicating where an addition is to be inserted.
<stamp>: indicates a stamp of the holding library.
<num>: indicates the page number as it is presented on
the page. A 'type' attribute specifies whether these numbers were prenumbered in the
notebook, written by Beckett, or added by an archivist.
<floatingText>: An archive number that was written on a
document by the archivist, is seen as a floatingText, as defined by TEI.
<date>: to encode dates (only applied in Stirrings
Still/Soubresauts).
Variants
Genetic Variants (rewritings)
Rewritings (variants between the 'top layer' of different
versions in the genetic dossier) are marked by means of <rdg> tags.
Translation variants
Mismatches between the English and French are marked by means
of <rdg> tags with a 'type' attribute value 'trans'. The absence
of a word or word string that appears in the corresponding translation or
original is indicated by means of a rend attribute,
mentioning the absence:
<rdg type="trans" rend="absence"/>
In the BDMP this absence is visualized by means of a vertical
bar.
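A display routine for translation variants might treat the absence marker as follows. This is a sketch under stated assumptions: the function name render_reading is invented, and the second reading in the usage example is hypothetical content added for contrast.

```python
import xml.etree.ElementTree as ET


def render_reading(rdg) -> str:
    """Return the display text of an <rdg>; an absence becomes a vertical bar."""
    if rdg.get("type") == "trans" and rdg.get("rend") == "absence":
        return "|"
    return "".join(rdg.itertext())


absent = ET.fromstring('<rdg type="trans" rend="absence"/>')
present = ET.fromstring('<rdg type="trans">word</rdg>')  # hypothetical reading
print(render_reading(absent), render_reading(present))
```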
Notes:
[1] "pb", "stemma", "time",
"section", "trans", "orig", "textn", "over" and "chrono" have been added to the global
attributes. The attributes "version" and "zone" have been added to the tags <text>, <div>, <p>, <sp>, <l>, <lg>, <stage> and <seg>. A <sub> tag has been added.
[2] Published on 17/1/2013
(http://www.tei-c.org/Vault/P5/2.3.0/doc/tei-p5-doc/en/html/).
[3] Published on 2/11/2007
(http://www.tei-c.org/Vault/P5/1.0.0/doc/tei-p5-doc/en/html/).
[4] TEI SIG website: http://www.tei-c.org/Activities/SIG/Manuscript/.
[5] Published on 16/12/2011
(http://www.tei-c.org/Vault/P5/2.0.0/doc/tei-p5-doc/en/html/).
[6] The differences between
the two versions of TEI P5 come down to these differences between L'Innommable /
The Unnamable and Stirrings Still / Soubresauts: <metamark> vs. <ge:metamark>, <listTranspose> vs. <ge:transposeGrp>, <transpose> vs. <ge:transpose>, <handNotes> and <handNote>
vs. <handList> and <hand>.
[7] The edition is
published as a Cocoon web application inside the Apache Tomcat servlet container (http://tomcat.apache.org/). The search engine makes use of Elasticsearch (https://www.elastic.co/).
[8] The chronology of the
sentence versions relating to the first 24 pages of the first English typescript of
The Unnamable differs from the chronology of the rest of the text. A more
detailed analysis is made under "Chronology" in the L'Innommable / The Unnamable
module.
© 2025 Samuel Beckett Digital Manuscript Project
Directors: Dirk Van Hulle and Mark Nixon | Technical realisation: Vincent
Neyt