Downs and Acrosses: Textual Markup on a Stroke Based Level

Terras, M.; Robertson, P. (2004) Downs and Acrosses: Textual Markup on a Stroke Based Level. Literary and Linguistic Computing, 19 (3) pp. 397-414. 10.1093/llc/19.3.397. Green open access

Textual encoding is one of the main focuses of Humanities Computing. However, existing encoding schemes and initiatives focus on 'text' from the character level upwards, and are of little use to scholars, such as papyrologists and palaeographers, who study the constituent strokes of individual characters. This paper discusses the development of a markup system used to annotate a corpus of images of Roman texts, resulting in an XML representation of each character on a stroke by stroke basis. The XML data generated allows further interrogation of the palaeographic data, increasing the knowledge available regarding the palaeography of the documentation produced by the Roman Army. Additionally, the corpus was used to train an Artificial Intelligence system to effectively 'read' in stroke data of unknown text and output possible, reliable, interpretations of that text: the next step in aiding historians in the reading of ancient texts. The development and implementation of the markup scheme is introduced, the results of our initial encoding effort are presented, and it is demonstrated that textual markup on a stroke level can extend the remit of marked-up digital texts in the humanities.

Title: Downs and Acrosses: Textual Markup on a Stroke Based Level
Open access status: An open access version is available from UCL Discovery
Publisher version: http://dx.doi.org/doi:10.1093/llc/19.3.397
UCL classification: UCL > School of Arts and Social Sciences > Faculty of Arts and Humanities > Information Studies
