Please use this identifier to cite or link to this item:
http://localhost:8080/xmlui/handle/123456789/9998

| Title: | Combining link and content analysis for text clustering |
| Authors: | Ferdjouni, Zineddine; Chikhi, Nacim Fateh (Supervisor) |
| Keywords: | Clustering (unsupervised classification); Text mining; Bibliometrics; Data mining; Cluster analysis; Multi-view NMF (MNMF) |
| Issue Date: | 2013 |
| Publisher: | Université Blida 1 |
| Abstract: | In many applications, huge amounts of textual data are generated continuously. The web is a typical example: hundreds of thousands (if not millions) of articles are published every day. To facilitate access to such large document collections, researchers have developed various tools to organise them. Document clustering is one of these techniques and has recently become a very active area of research. Many document clustering algorithms have been developed, such as PLSA (Probabilistic Latent Semantic Analysis) and NMF (Non-negative Matrix Factorization). These approaches, however, use only the textual content of documents and do not exploit other information such as the links between documents. In this work we propose a new hybrid algorithm for document clustering, Multi-view Non-negative Matrix Factorization (MNMF), which takes into account not only the textual content of documents but also the link information. We show the validity of the proposed approach through experiments on real document collections. |
| Description: | Ill., bibliogr. Call number: ma-004-132 |
| URI: | http://di.univ-blida.dz:8080/jspui/handle/123456789/9998 |
| Appears in Collections: | Mémoires de Master |
Files in This Item:
| File | Description | Size | Format |
|---|---|---|---|
| doc20210214220139.pdf | | 26.84 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.