Efficient Hash Function for Duplicate Elimination in Dictionaries

Skala, Václav; Hrádek, Jan

Full metadata record

DC Field	Value	Language
dc.contributor.author	Skala, Václav
dc.contributor.author	Hrádek, Jan
dc.date.accessioned	2014-12-18T09:26:45Z
dc.date.available	2014-12-18T09:26:45Z
dc.date.issued	2009
dc.identifier.citation	Algoritmy 2009: 18th Conference on Scientific Computing, p. 382-391.	en
dc.identifier.isbn	978-80-227-3032-7
dc.identifier.uri	http://hdl.handle.net/11025/11785
dc.description.abstract	Fast elimination of duplicate data is needed in many areas, especially in the textual data context. A solution to this problem was recently found for geometrical data using a hash function to speed up the process. The use of a hash function is extremely efficient when incremental elimination is required, especially for processing large data sets. In this paper, a new construction of the hash function is presented, giving short clusters with only a few collisions. The proposed hash function is not a perfect hash function; nevertheless, it has similar properties. The hash function takes advantage of the relatively large amount of memory available on modern computers and works well with large data sets. Experiments have shown that different approaches should be used for different types of languages, because the structures of Slavonic and Anglo-Saxon languages differ. Therefore, tests were made with a Czech dictionary of 2.5 million words and an English dictionary of 130 thousand words. The algorithm was also tested on a few other languages. Experimental results are presented in this paper as well.	en
dc.format	10 p.	cs
dc.format.mimetype	application/pdf
dc.language.iso	en	en
dc.publisher	Slovenská technická univerzita v Bratislavě	cs
dc.relation.ispartofseries	Algoritmy 2009	en
dc.rights	Full text is not accessible.	cs
dc.subject	hešovací funkce	cs
dc.subject	hešovací tabulka	cs
dc.subject	struktura dat	cs
dc.title	Efficient Hash Function for Duplicate Elimination in Dictionaries	en
dc.type	preprint	cs
dc.type	preprint	en
dc.rights.access	closedAccess	en
dc.type.version	draft	en
dc.subject.translated	hash function	en
dc.subject.translated	hash table	en
dc.subject.translated	data structure	en
dc.type.status	Peer-reviewed	en
Appears in collections:	Preprinty / Preprints (KIV)

Files attached to this record:

File	Description	Size	Format
Skala_2009_HASH_Dictionary-Algoritmy.pdf	Full text	311.07 kB	Adobe PDF

Use this identifier to cite or link to this record: http://hdl.handle.net/11025/11785

All records in DSpace are protected by copyright, with all rights reserved.
