A Comparative Analysis of Text Similarity Measures and Algorithms in Research Paper Recommender Systems
Publisher
Conference on Information Communications Technology and Society
Abstract
The increase in the number of research papers published online can be attributed to recent developments in internet and web technologies. However, researchers and online users have a difficult time finding relevant and accurate information because of the information explosion on the internet. In this paper, we seek to establish which combinations of algorithms and similarity metrics can be used to optimise the search and recommendation of articles in research paper recommender systems. Our investigation utilised non-linear classification algorithms together with text similarity measures. An offline evaluation approach is used to determine the models' accuracy and performance, while various similarity metrics are assessed using available datasets. We apply the Recursive PARTitioning (rpart), Random Forest and Boosted machine learning algorithms to research paper similarity evaluation datasets. The rpart algorithm generally performed well compared with the Boosted and Random Forest algorithms, achieving an average accuracy of 80.73 and an average time of 2.354628 seconds. Cosine similarity performed best among the similarity metrics assessed. New similarity metrics and measures are to be proposed. This paper establishes that some combinations of metrics and algorithms are better than others when developing models for research paper similarity evaluation and recommendation. Further challenges and open issues are identified.
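As an illustration of the cosine similarity metric named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes whitespace tokenisation and raw term counts as the vector representation; the function name and the sample titles are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between two texts, using raw term counts.

    Assumption: lowercased, whitespace-split tokens form the vocabulary.
    """
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())

    # Dot product only needs the terms that occur in both texts.
    shared_terms = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[term] * counts_b[term] for term in shared_terms)

    # Euclidean norms of the two term-count vectors.
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))

    # An empty text has no direction, so treat its similarity as zero.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical paper titles, used only to exercise the function.
    title_1 = "text similarity measures for recommender systems"
    title_2 = "similarity measures in research paper recommender systems"
    print(cosine_similarity(title_1, title_2))
```

A score of 1 means the two term-count vectors point in the same direction, and 0 means the texts share no terms. A recommender could rank candidate papers by this score against a query paper, although the abstract does not state how the scores were combined with the classifiers.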
Description
journal article
Citation
Maake, B., Ojo, S. O., & Zuva, T. (2018). A Comparative Analysis of Text Similarity Measures and Algorithms in Research Paper Recommender Systems. Conference on Information Communications Technology and Society
