18 Ordibehesht 1403 (7 May 2024)
Mehdi Hosseinzadeh Aghdam

Academic rank: Associate Professor
Address: Iran / East Azerbaijan / Bonab / Velayat Highway
Education: Ph.D. / Computer Engineering - Artificial Intelligence
Phone: 041-37741636
Faculty: Faculty of Engineering
Department: Department of Computer Engineering

Research Details

Title
Automatic Extractive and Generic Document Summarization Based on NMF
Research type: Published article
Keywords
Text Summarization, Latent Topics, Non-negative Matrix Factorization, Updating Rules
Researchers: Mehdi Hosseinzadeh Aghdam (first author)

Abstract

Textual information on the Internet is growing exponentially. Text summarization (TS) plays a crucial role in coping with this massive amount of textual content. Manual TS is time-consuming and impractical in applications that involve a huge amount of textual information. Automatic text summarization (ATS) is an essential technology for overcoming these challenges. Non-negative matrix factorization (NMF) is a useful tool for extracting semantic content from textual data. Existing NMF approaches focus only on how the factorized matrices should be modeled and neglect the relationships among sentences. These relationships lead to a better factorization for TS. This paper proposes a novel non-negative matrix factorization for text summarization (NMFTS). The proposed ATS model imposes regularization on pairwise sentence vectors. A new cost function based on the Frobenius norm is designed, and an algorithm is developed to minimize this function using iterative updating rules. The proposed NMFTS extracts semantic content by reducing the size of documents and mapping similar sentences close together in the latent topic space. Compared with the basic NMF, the convergence time of the proposed method does not grow. The convergence proof of NMFTS and empirical results on benchmark data sets show that the suggested updating rules converge quickly and achieve superior results compared to other methods.
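
To make the baseline in the abstract concrete, the following is a minimal sketch of basic NMF with the standard multiplicative updating rules that minimize the Frobenius-norm cost ||A - WH||^2, applied to a toy term-by-sentence matrix. It is not the NMFTS algorithm: the abstract does not state the pairwise sentence regularizer or the resulting updating rules, so none of that is reproduced here. The function names (`nmf`, `sentence_scores`), the toy sentences, and the scoring scheme (each topic weighted by its share of the total mass in H) are assumptions chosen for illustration only.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    b_cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(a, w, h):
    """Squared Frobenius norm of A - WH."""
    wh = matmul(w, h)
    return sum((a[i][j] - wh[i][j]) ** 2
               for i in range(len(a)) for j in range(len(a[0])))

def nmf(a, k, iters=200, seed=0, eps=1e-9):
    """Basic NMF (not NMFTS): A (terms x sentences) ~ W (terms x k) H (k x sentences)."""
    rng = random.Random(seed)
    m, n = len(a), len(a[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    errors = []
    for _ in range(iters):
        # H <- H * (W^T A) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, a)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(k)]
        # W <- W * (A H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(a, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(m)]
        errors.append(frobenius_error(a, w, h))
    return w, h, errors

def sentence_scores(h):
    """Illustrative scoring (an assumption): weight each topic by its share of H."""
    total = sum(sum(row) for row in h) or 1.0
    topic_weight = [sum(row) / total for row in h]
    n = len(h[0])
    return [sum(h[i][j] * topic_weight[i] for i in range(len(h))) for j in range(n)]

# Toy corpus (hypothetical): columns of A are sentences, rows are terms.
sentences = [
    "nmf extracts latent topics from text",
    "latent topics group related sentences",
    "the weather was sunny today",
    "nmf factorizes the term sentence matrix",
]
vocab = sorted({t for s in sentences for t in s.split()})
a = [[float(s.split().count(t)) for s in sentences] for t in vocab]

w, h, errors = nmf(a, k=2)
scores = sentence_scores(h)
ranking = sorted(range(len(sentences)), key=lambda j: -scores[j])
```

An extractive summary would then take the top-ranked indices in `ranking`. The `errors` list records the cost after each iteration, which is one simple way to inspect how quickly a set of updating rules converges on a given data set.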