Title: | Unsupervised methods for language modeling: technical report no. DCSE/TR-2012-03 |
Authors: | Brychcín, Tomáš |
Date of issue: | 2012 |
Publisher: | University of West Bohemia in Pilsen |
Document type: | report |
URI: | http://www.kiv.zcu.cz/publications/ http://hdl.handle.net/11025/21549 |
Keywords: | jazykový model; n-gram |
Keywords in another language: | language model; n-gram |
Abstract in another language: | Language models are crucial for many tasks in NLP, and n-grams are the best way to build them. Huge effort is being invested in improving n-gram language models. By introducing external information (morphology, syntax, partitioning into documents, etc.) into the models, a significant improvement can be achieved. The models can, however, also be improved with no external information, and smoothing is an excellent example of such an improvement. The thesis summarizes the state-of-the-art approaches to unsupervised language modeling, with emphasis on inflectional languages, which are particularly hard to model. It focuses on methods that can discover hidden patterns already present in a training corpus. These patterns can be very useful for enhancing the performance of language modeling; moreover, they do not require additional information sources. |
Rights: | © University of West Bohemia in Pilsen |
Appears in collections: | Zprávy / Reports (KIV) |
Files attached to this record:
File | Description | Size | Format | |
---|---|---|---|---|
Brychcin.pdf | Full text | 425.44 kB | Adobe PDF | View/Open |
Use this identifier to cite or link to this record:
http://hdl.handle.net/11025/21549
All records in DSpace are protected by copyright, with all rights reserved.