Please use this identifier to cite or link to this item: http://localhost:8080/xmlui/handle/123456789/9998
Title: Combining link and content analysis for text clustering
Authors: Ferdjouni, Zineddine
Chikhi, Nacim Fateh (Supervisor)
Keywords: Clustering (unsupervised classification)
Text mining
Bibliometrics
Data mining
Cluster analysis
Multi-view NMF (MNMF)
Issue Date: 2013
Publisher: Université Blida 1
Abstract: In many applications, huge amounts of textual data are generated continuously. The web is a typical example, with hundreds of thousands (if not millions) of articles published every day. To facilitate access to such large document collections, researchers have developed various tools to organise them. Document clustering is one of these techniques and has recently become a very active area of research. Many document clustering algorithms have been developed, such as PLSA (Probabilistic Latent Semantic Analysis) and NMF (Non-negative Matrix Factorization). These approaches, however, use only the textual content of documents and do not exploit other information such as the links between documents. In this work we propose a new hybrid algorithm for document clustering, Multi-view Non-negative Matrix Factorization (MNMF), which takes into account not only the textual content of documents but also the link information. We show the validity of the proposed approach through experiments on real document collections.
Description: ill., bibliogr. Call number: ma-004-132
URI: http://di.univ-blida.dz:8080/jspui/handle/123456789/9998
Appears in Collections: Mémoires de Master

Files in This Item:
File: doc20210214220139.pdf
Size: 26.84 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.