18 Ordibehesht 1403
Mehdi Hosseinzadeh Aghdam

Academic rank: Associate Professor
Address: Iran / East Azerbaijan / Bonab / Velayat Highway
Education: PhD / Computer Engineering - Artificial Intelligence
Phone: 041-37741636
Faculty: Faculty of Engineering
Department: Department of Computer Engineering

Research details

Title
A novel regularized asymmetric non-negative matrix factorization for text clustering
Research type: Published article
Keywords
text clustering, semantic features, non-negative matrix factorization, multiplicative updating rules
Researchers: Mehdi Hosseinzadeh Aghdam (first author), Mohammad Daryaie Zanjani (second author)

Abstract

Non-negative matrix factorization (NMF) is a dimension reduction method that extracts semantic features from high-dimensional data. Most optimization methods developed for NMF attend only to how each feature vector of the factorized matrices is modeled and ignore the relationships among feature vectors. Exploiting such relationships among documents' feature vectors yields a better factorization for text clustering. This paper proposes a novel regularized asymmetric non-negative matrix factorization (RANMF) for text clustering. The proposed method places regularization constraints on pairwise feature vectors by applying penalties based on distance measures. We design a new cost function based on the Kullback-Leibler divergence and develop an optimization scheme that minimizes it through novel multiplicative updating rules. The proposed method places documents from the same cluster close together in the new representation space. Hence, the acquired parts-based representation has cluster labels consistent with the original space and greater discriminating ability. The complexity analysis shows that adding the regularizers does not increase the time cost of RANMF compared with the original NMF. In the experiments, RANMF converges quickly, terminating in fewer than ten iterations. The complete proof of convergence and the experimental results on benchmark data sets demonstrate that the proposed multiplicative updating rules converge fast and achieve superior results compared to other algorithms.
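To make the idea of multiplicative updating rules under a Kullback-Leibler cost concrete, here is a minimal, dependency-free sketch of the standard (unregularized) KL-divergence NMF updates of Lee and Seung. It is not the RANMF algorithm: the regularization terms and update rules of the paper are not given on this page, so none are reproduced. The function names (`matmul`, `ratio_matrix`, `kl_divergence`, `nmf_kl`), the small constant `EPS`, and the toy term-document matrix are illustrative assumptions.

```python
import math
import random

EPS = 1e-12  # assumption: small constant guarding against division by zero


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def ratio_matrix(V, WH):
    """Element-wise V / (WH), the quantity shared by both update rules."""
    return [[v / (r + EPS) for v, r in zip(v_row, r_row)]
            for v_row, r_row in zip(V, WH)]


def kl_divergence(V, WH):
    """Generalized KL divergence D(V || WH) = sum(V log(V/WH) - V + WH)."""
    total = 0.0
    for v_row, r_row in zip(V, WH):
        for v, r in zip(v_row, r_row):
            if v > 0:
                total += v * math.log(v / (r + EPS))
            total += r - v
    return total


def nmf_kl(V, k, iters=50, seed=0):
    """Factorize V (terms x documents) as W (terms x k) times H (k x documents).

    Plain Lee-Seung multiplicative updates, with no regularizer.
    Returns W, H and the divergence recorded after each iteration.
    """
    rng = random.Random(seed)
    m, n = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    history = []
    for _ in range(iters):
        # H[a][j] *= sum_i W[i][a] * V[i][j] / (WH)[i][j] / sum_i W[i][a]
        ratio = ratio_matrix(V, matmul(W, H))
        for a in range(k):
            col_sum = sum(W[i][a] for i in range(m)) + EPS
            for j in range(n):
                H[a][j] *= sum(W[i][a] * ratio[i][j] for i in range(m)) / col_sum
        # W[i][a] *= sum_j H[a][j] * V[i][j] / (WH)[i][j] / sum_j H[a][j]
        ratio = ratio_matrix(V, matmul(W, H))
        for a in range(k):
            row_sum = sum(H[a]) + EPS
            for i in range(m):
                W[i][a] *= sum(H[a][j] * ratio[i][j] for j in range(n)) / row_sum
        history.append(kl_divergence(V, matmul(W, H)))
    return W, H, history


if __name__ == "__main__":
    # Toy term-document counts: rows are terms, columns are documents.
    V = [[3, 2, 0, 0], [2, 3, 0, 0], [0, 0, 3, 2], [0, 0, 2, 3]]
    W, H, history = nmf_kl(V, k=2)
    # A document's cluster is the index of its largest coefficient in H.
    labels = [max(range(2), key=lambda a: H[a][j]) for j in range(len(V[0]))]
    print(labels, history[-1])
```

Because every update multiplies a non-negative entry by a non-negative ratio, W and H stay non-negative without any projection step. A regularized variant such as the one the abstract describes would add penalty terms to the cost, which changes the numerators and denominators of these ratios.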