Title: Classifying Latin Inscriptions of the Roman Empire: A Machine-Learning Approach
Authors: Ehrmann, Maud
Kaše, Vojtěch
Karsdorp, Folgert
Heřmánková, Petra
Wevers, Melvin
Sobotková, Adéla
Andrews, Tara Lee
Burghardt, Manuel
Kestemont, Mike
Manjavacas, Enrique
Piotrowski, Michael
van Zundert, Joris
Citation: KAŠE, V. HEŘMÁNKOVÁ, P. SOBOTKOVÁ, A. Classifying Latin Inscriptions of the Roman Empire: A Machine-Learning Approach. In Ehrmann, M., Karsdorp, F., Wevers, M. Proceedings of the Conference on Computational Humanities Research 2021. Amsterdam: CEUR-WS, 2021. pp. 123-135. ISBN: not stated, ISSN: 1613-0073
Issue Date: 2021
Publisher: CEUR-WS
Document type: conference paper
ConferenceObject
URI: http://hdl.handle.net/11025/46904
ISBN: not stated
ISSN: 1613-0073
Keywords in different language: Latin inscriptions;document classification;comparative analysis;Roman Empire
Abstract in different language: Large-scale synthetic research in ancient history is often hindered by the incompatibility of taxonomies used by different digital datasets. Using the example of enriching the Latin Inscriptions from the Roman Empire dataset (LIRE), we demonstrate that machine-learning classification models can bridge the gap between two distinct classification systems and make comparative study possible. We report on the training, testing, and application of a machine-learning classification model that uses inscription categories from the Epigraphic Database Heidelberg (EDH) to label inscriptions from the Epigraphic Database Claus-Slaby (EDCS). The model is trained on a labeled set of records included in both sources (N=46,171). Several different classification algorithms and parametrizations are explored. The final model is based on the Extremely Randomized Trees algorithm (ET) and employs 10,055 features, based on several attributes. The final model classifies two thirds of a test dataset with 98% accuracy and 85% of it with 95% accuracy. After model selection and evaluation, we apply the model to inscriptions covered exclusively by EDCS (N=83,482) in an attempt to adopt one consistent system of classification for all records within the LIRE dataset.
Rights: © authors
Appears in Collections: Konferenční příspěvky / Conference papers (KFI)
OBD

Files in This Item:
File: 43933987 Kaše Classifying Latin Inscriptions of the Roman Empire.pdf
Size: 702.18 kB
Format: Adobe PDF


Please use this identifier to cite or link to this item: http://hdl.handle.net/11025/46904

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
