The University of Sheffield
Department of Computer Science

COM4115 Text Processing

Summary This module introduces fundamental concepts and ideas in natural language text processing, covers techniques for handling text corpora, and examines representative systems that require the automated processing of large volumes of text. The course focuses on modern quantitative techniques for text analysis and explores important models for representing and acquiring information from texts. Students should be aware that there are limited places available on this course.
Session Autumn 2021/22

Credits 15
Assessment
  • Assignments [LO1 and LO3]
  • Formal examination [LO2 and LO3]
Lecturer(s) Prof. Rob Gaizauskas, Dr Carolina Scarton & Dr Temitope Adeosun
Resources
Aims
  • to develop an understanding of the fundamentals of how text is represented and processed in a computer;
  • to acquire familiarity with standard computational techniques for handling text corpora;
  • to develop an understanding of the basic problems and principles underlying text processing applications;
  • to explore, for one or more topics in text processing, refinements beyond the most basic approaches.
Objectives By the end of this unit, a candidate should be able to:
  1. code in a programming language well-suited to text handling
  2. identify and explain key techniques that are relevant to performing a number of text processing tasks
  3. implement systems able to analyse large volumes of textual data, and to perform basic and, in selected cases, more advanced text processing tasks
Content
  • Programming for text processing
  • Text processing topics, such as:
    • Text Encoding and Text Compression
    • Vector-based Representations for Words and Documents
    • Information Retrieval
    • Information Extraction
    • Sentiment analysis
    • Summarisation
Restrictions Not permitted for students who have already taken COM3110.
Teaching Method
  • There will be two lectures per week, with no more than 20 lectures overall.
  • In some weeks, a third session will be available for lab classes or tutorials.
Feedback Students can discuss their lab exercise code during lab sessions. They will receive feedback comments on their marked assignment work later in the term, before the exam.
Recommended Reading