Title: Classifying Latin Inscriptions of the Roman Empire: A Machine-Learning Approach
Authors: Ehrmann, Maud
Kaše, Vojtěch
Karsdorp, Folgert
Heřmánková, Petra
Wevers, Melvin
Sobotková, Adéla
Andrews, Tara Lee
Burghardt, Manuel
Kestemont, Mike
Manjavacas, Enrique
Piotrowski, Michael
van Zundert, Joris
Source document citation: KAŠE, V. HEŘMÁNKOVÁ, P. SOBOTKOVÁ, A. Classifying Latin Inscriptions of the Roman Empire: A Machine-Learning Approach. In Ehrmann, M., Karsdorp, F., Wevers, M. Proceedings of the Conference on Computational Humanities Research 2021. Amsterdam: CEUR-WS, 2021. pp. 123-135. ISBN: not stated, ISSN: 1613-0073
Date issued: 2021
Publisher: CEUR-WS
Document type: conference paper
URI: http://hdl.handle.net/11025/46904
ISBN: not stated
ISSN: 1613-0073
Keywords in another language: Latin inscriptions; document classification; comparative analysis; Roman Empire
Abstract in another language: Large-scale synthetic research in ancient history is often hindered by the incompatibility of taxonomies used by different digital datasets. Using the example of enriching the Latin Inscriptions from the Roman Empire dataset (LIRE), we demonstrate that machine-learning classification models can bridge the gap between two distinct classification systems and make comparative study possible. We report on training, testing and application of a machine-learning classification model using inscription categories from the Epigraphic Database Heidelberg (EDH) to label inscriptions from the Epigraphic Database Clauss-Slaby (EDCS). The model is trained on a labeled set of records included in both sources (N=46,171). Several different classification algorithms and parametrizations are explored. The final model is based on the Extremely Randomized Trees (ET) algorithm and employs 10,055 features, based on several attributes. The final model classifies two thirds of a test dataset with 98% accuracy and 85% of it with 95% accuracy. After model selection and evaluation, we apply the model to inscriptions covered exclusively by EDCS (N=83,482) in an attempt to adopt one consistent system of classification for all records within the LIRE dataset.
Rights: © authors
Appears in collections: Konferenční příspěvky / Conference papers (KFI)

Files attached to this record:
File: 43933987 Kaše Classifying Latin Inscriptions of the Roman Empire.pdf
Size: 702.18 kB
Format: Adobe PDF

Use this identifier to cite or link to this record: http://hdl.handle.net/11025/46904
