Multi-Document Text Summarization Using Deep Belief Network

Authors

  • Azal Minshed Abid, College of Education, Mustansiriyah University, Iraq

DOI:

https://doi.org/10.31695/IJASRE.2022.8.8.7

Keywords:

DBN, PageRank, DUC-2004, ROUGE, Summarization

Abstract

The large volume of information now available on the Internet makes it difficult for users to find what they are looking for. Extractive text summarization methods reduce the amount of text in a document collection by keeping the most important information and discarding redundant information. A summary should preserve the main ideas and the meaning of the original text. This paper proposes a new automatic, generic, extractive multi-document summarization model that aims to produce a sufficiently informative summary. The proposed model extracts nine different features from each sentence in the document collection. The extracted features are fed to a Deep Belief Network (DBN), which classifies each sentence as either important or unimportant. Only the important sentences pass to the next phase, where they are used to construct a graph. The PageRank algorithm then assigns a score to each sentence in the graph, and the sentences with the highest scores are selected to form the summary. The performance of the proposed model was evaluated on the DUC-2004 (Task 2) dataset using the ROUGE metrics. The experimental results show that the proposed model is more effective than the baseline method and some state-of-the-art methods, reaching a ROUGE-1 score of 0.4032 and a ROUGE-2 score of 0.1021.
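
The graph-ranking phase described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation. It assumes the sentences have already been classified as important (the DBN step is omitted), that edge weights are Jaccard similarities between sentence word sets, and that the damping factor is the conventional 0.85; the abstract specifies none of these choices. The function names (tokenize, build_graph, pagerank, summarize) are hypothetical.

```python
import re


def tokenize(sentence):
    """Lowercase a sentence and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (assumed edge weight)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_graph(sentences):
    """Return a symmetric weight matrix with one node per sentence."""
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            weights[i][j] = weights[j][i] = jaccard(tokens[i], tokens[j])
    return weights


def pagerank(weights, damping=0.85, iterations=100, tol=1e-8):
    """Weighted PageRank by power iteration; the scores sum to 1."""
    n = len(weights)
    if n == 0:
        return []
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Rank held by nodes without edges is spread uniformly over all nodes.
        dangling = sum(scores[j] for j in range(n) if out_sum[j] == 0.0)
        new_scores = []
        for i in range(n):
            incoming = sum(
                scores[j] * weights[j][i] / out_sum[j]
                for j in range(n)
                if out_sum[j] > 0.0
            )
            new_scores.append(
                (1.0 - damping) / n + damping * (incoming + dangling / n)
            )
        delta = sum(abs(a - b) for a, b in zip(new_scores, scores))
        scores = new_scores
        if delta < tol:
            break
    return scores


def summarize(sentences, k=2):
    """Pick the k highest-scoring sentences, kept in their original order."""
    scores = pagerank(build_graph(sentences))
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling summarize(sentences, k=3) on a list of candidate sentences returns the three sentences that are most central in the similarity graph. A sentence that shares no words with the others receives only the baseline score and is unlikely to be selected.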

How to Cite

Azal Minshed Abid. (2022). Multi-Document Text Summarization Using Deep Belief Network. International Journal of Advances in Scientific Research and Engineering (IJASRE), ISSN:2454-8006, DOI: 10.31695/IJASRE, 8(8), 56–65. https://doi.org/10.31695/IJASRE.2022.8.8.7
