Title: | Classifying Latin Inscriptions of the Roman Empire: A Machine-Learning Approach |
Authors: | Ehrmann, Maud; Kaše, Vojtěch; Karsdorp, Folgert; Heřmánková, Petra; Wevers, Melvin; Sobotková, Adéla; Andrews, Tara Lee; Burghardt, Manuel; Kestemont, Mike; Manjavacas, Enrique; Piotrowski, Michael; van Zundert, Joris |
Source document citation: | KAŠE, V., HEŘMÁNKOVÁ, P., SOBOTKOVÁ, A. Classifying Latin Inscriptions of the Roman Empire: A Machine-Learning Approach. In Ehrmann, M., Karsdorp, F., Wevers, M. Proceedings of the Conference on Computational Humanities Research 2021. Amsterdam: CEUR-WS, 2021. pp. 123-135. ISBN: not stated, ISSN: 1613-0073 |
Date issued: | 2021 |
Publisher: | CEUR-WS |
Document type: | conference paper (ConferenceObject) |
URI: | http://hdl.handle.net/11025/46904 |
ISBN: | not stated |
ISSN: | 1613-0073 |
Keywords in another language: | Latin inscriptions; document classification; comparative analysis; Roman Empire |
Abstract in another language: | Large-scale synthetic research in ancient history is often hindered by the incompatibility of taxonomies used by different digital datasets. Using the example of enriching the Latin Inscriptions from the Roman Empire dataset (LIRE), we demonstrate that machine-learning classification models can bridge the gap between two distinct classification systems and make comparative study possible. We report on training, testing and application of a machine learning classification model using inscription categories from the Epigraphic Database Heidelberg (EDH) to label inscriptions from the Epigraphic Database Claus-Slaby (EDCS). The model is trained on a labeled set of records included in both sources (N=46,171). Several different classification algorithms and parametrizations are explored. The final model is based on Extremely Randomized Trees algorithm (ET) and employs 10,055 features, based on several attributes. The final model classifies two thirds of a test dataset with 98% accuracy and 85% of it with 95% accuracy. After model selection and evaluation, we apply the model on inscriptions covered exclusively by EDCS (N=83,482) in an attempt to adopt one consistent system of classification for all records within the LIRE dataset. |
Rights: | © authors |
Appears in collections: | Konferenční příspěvky / Conference papers (KFI) OBD |
Files in this item:
File | Size | Format | |
---|---|---|---|
43933987 Kaše Classifying Latin Inscriptions of the Roman Empire.pdf | 702.18 kB | Adobe PDF | View/Open |
Please use this identifier to cite or link to this item:
http://hdl.handle.net/11025/46904