Publication details
Bibliographic description
Comparison of latent semantic analysis and probabilistic latent semantic analysis for documents clustering / Marcin KUTA, Jacek KITOWSKI // Computing and Informatics / Slovak Academy of Sciences. Institute of Informatics ; ISSN 1335-9150. — Former title: Computers and Artificial Intelligence. — 2014 — vol. 33 no. 3, pp. 652–666. — Bibliography pp. 664–665, abstract.
Authors (2)
Keywords
Bibliometric data
BaDAP ID | 91367 |
---|---|
Date added to BaDAP | 2015-09-03 |
Publication year | 2014 |
Publication type | journal article |
Open access | |
Journal/series | Computing and Informatics |
Abstract
In this paper we compare the usefulness of statistical dimensionality reduction techniques for improving the clustering of documents in Polish. We start with partitional and agglomerative algorithms applied to the Vector Space Model. We then investigate two transformations: Latent Semantic Analysis and Probabilistic Latent Semantic Analysis. The results show an advantage of Latent Semantic Analysis over the probabilistic model. We also analyse the time and memory consumption of these transformations and present runtime details for an IBM BladeCenter HS21 machine.