Murtadha Sahbudin MSc Dissertation 2015/16
Automatic Plagiarism Detection against Large Text Collection
Supervised by M. Hepple
Abstract
Plagiarism is a major issue in a number of professional fields, especially academia and research. External plagiarism detection is the approach in which a set of suspicious documents is compared against an external collection of source documents. In this project, we use an IR-based method for candidate source selection, combined with pre-processing methods, and then perform in-depth text analysis of each suspicious document against its candidate source documents. To achieve this, we investigate the pre-processing methods of Rapid Keyword Extraction and trigram collocation, implement IR-based retrieval of candidate source documents using Lucene, and finally implement a method for text alignment detection within passages using the Jaccard coefficient score. The evaluation and corpus are based on the PAN PC 2011 standards.
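As an illustration of the passage-level comparison mentioned above, the following is a minimal sketch of a Jaccard coefficient computed over word trigrams. It is not the dissertation's implementation: the function names `word_trigrams` and `jaccard`, the simple lowercase alphanumeric tokenisation, and the choice to score two empty passages as 0.0 are all assumptions made for this example.

```python
import re


def word_trigrams(text):
    """Return the set of word trigrams in a passage.

    Assumption: tokens are lowercase runs of letters and digits;
    the actual pre-processing in the project may differ.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)}


def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| for two sets.

    Assumption: two empty sets are scored 0.0 rather than raising.
    """
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical passages, used only to show how the score is obtained.
suspicious = "the quick brown fox jumps over the lazy dog"
source = "the quick brown fox leaps over the lazy dog"

score = jaccard(word_trigrams(suspicious), word_trigrams(source))
print(score)
```

In a pipeline of the kind described, a score like this would be computed for each suspicious passage against passages of the candidate sources returned by the retrieval step, with passages above some chosen threshold reported as potential matches. The threshold itself is a tuning decision and is not specified here.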