The University of Sheffield
Department of Computer Science

COM6115 Text Processing

Summary This module introduces fundamental concepts and ideas in natural language text processing, covers techniques for handling text corpora, and examines representative systems that require the automated processing of large volumes of text. The module focuses on modern quantitative techniques for text analysis and explores important models for representing and acquiring information from texts.
Session Autumn 2024/25
Credits 15
Assessment
  • Assignment [LOs 1 & 3]
  • Formal examination [LOs 2 & 3]
Lecturer(s) Dr Carolina Scarton & Ms Varvara Papazoglou
Aims
  • to develop an understanding of the fundamentals of text processing;
  • to acquire familiarity with techniques for handling text corpora;
  • to develop an understanding of the basic problems and principles underlying text processing applications.
Learning Outcomes  By the end of this module, a candidate should be able to:
  1. code in a programming language well-suited to text handling
  2. identify and explain key techniques that are relevant to performing a number of text processing tasks
  3. implement systems able to analyse large volumes of textual data and to perform basic and, in selected cases, more advanced text processing tasks
Content
  • Programming for text processing
  • Linguistic background
  • Text processing topics, such as:
    • Information retrieval
    • Natural language generation
    • Information extraction
    • Sentiment analysis
Restrictions  Optional modules within the department have limited capacity. We will always try to accommodate all students but cannot guarantee a place. 
Teaching Method
  • Weekly lectures and lab sessions. 
Feedback Students can discuss their lab exercise code during lab sessions. They will receive feedback comments on their marked assignment later in the term, before the exam.