The University of Sheffield
Department of Computer Science

Lambros Hadjimichael Undergraduate Dissertation 2005/06

"Web Visualization"

Supervised by Dr ND Lawrence

Abstract

Spectral clustering and multidimensional scaling are mathematical techniques that have been used primarily in the social sciences to explain similarities or dissimilarities between objects. PageRank is the technique used by Google, today's most popular search engine.

In this report we investigate whether these techniques, possibly combined with other eigendecomposition techniques, can be used to compare the web pages of the Department of Computer Science. This involves extracting all the text from each web page and comparing the text between pages. The text is first converted to a numerical representation so that it can be processed with MATLAB. The PageRank algorithm could also form part of the method. Stemming and the use of bigrams are further techniques that could prove useful.
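
As a rough illustration of the pipeline outlined above, the following minimal sketch turns page text into bigram counts, compares two pages by cosine similarity, and ranks pages with a basic PageRank power iteration. It is written in Python rather than MATLAB purely for illustration. The function names, the page texts, the link graph, the choice of cosine similarity and the damping factor of 0.85 are all assumptions made for this example and are not taken from the dissertation.

```python
import math
from collections import Counter

def bigrams(text):
    """Count adjacent word pairs in lowercased, whitespace-split text."""
    words = text.lower().split()
    return Counter(zip(words, words[1:]))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def pagerank(links, damping=0.85, iters=50):
    """Basic PageRank by power iteration.

    links maps each page to the list of pages it links to; every link
    target is assumed to appear as a key of links as well.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            # A page with no outgoing links spreads its rank over all pages.
            targets = links[p] or pages
            share = damping * rank[p] / len(targets)
            for q in targets:
                new[q] += share
        rank = new
    return rank

# Hypothetical page texts and link structure, invented for this example.
texts = {
    "a": "machine learning research group",
    "b": "machine learning research seminar",
    "c": "undergraduate admissions information",
}
links = {"a": ["b"], "b": ["a", "c"], "c": ["a"]}

sim_ab = cosine(bigrams(texts["a"]), bigrams(texts["b"]))
sim_ac = cosine(bigrams(texts["a"]), bigrams(texts["c"]))
ranks = pagerank(links)
```

Here pages "a" and "b" share two of their three bigrams, so they score as similar, while "a" and "c" share none. A matrix of such pairwise similarities is the kind of input that spectral clustering or multidimensional scaling would then decompose; stemming could be applied to the words before the bigrams are counted.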