The University of Sheffield
Department of Computer Science

COM6115 Text Processing

Summary This module introduces fundamental concepts and ideas in natural language text processing, covers techniques for handling text corpora, and examines representative systems that require the automated processing of large volumes of text. The course focuses on modern quantitative techniques for text analysis and explores important models for representing and acquiring information from texts. Students should be aware that there are limited places available on this course.
Session Autumn 2023/24
Credits 15
Assessment
  • Assignment [LOs 1 & 3]
  • Formal examination [LOs 2 & 3]
Lecturer(s) Prof. Aline Villavicencio & Dr Chenghua Lin
Resources
Aims
  • to develop an understanding of the fundamentals of text processing;
  • to acquire familiarity with techniques for handling text corpora;
  • to develop an understanding of the basic problems and principles underlying text processing applications.
Learning Outcomes By the end of this unit, a candidate should be able to:
  1. code in a programming language well-suited to text handling
  2. identify and explain key techniques that are relevant to performing a number of text processing tasks
  3. implement systems able to analyse large volumes of textual data and to perform basic and, in selected cases, more advanced text processing tasks
Content
  • Programming for text processing
  • Linguistic background
  • Text processing topics, such as:
    • Information retrieval
    • Natural language generation
    • Information extraction
    • Sentiment analysis
Teaching Method
  • There will be 2 lectures per week, with not more than 20 lectures overall.
  • In some weeks, a third session will be available for lab classes or tutorials.
Feedback Students can discuss their lab exercise code during lab sessions. They will receive feedback comments on their marked assignment work later in the term, before the exam.