COM3110 Text Processing
Summary |
This module introduces fundamental concepts and ideas in
natural language text processing, covers techniques for
handling text corpora, and examines representative systems
that require the automated processing of large volumes of
text. The course focuses on modern quantitative techniques
for text analysis and explores important models for
representing and acquiring information from texts.
Students should be aware that places on this course are limited. |
Session |
Autumn 2023/24 |
Credits |
10 |
Assessment |
- Assignments [LO1 and LO3]
- Formal examination [LO2 and LO3]
|
Lecturer(s) |
Prof. Rob Gaizauskas & Dr Carolina Scarton |
Resources |
|
Aims |
The aims of this module are:
- to develop an understanding of the fundamentals of
how text is represented and processed in a computer;
- to acquire familiarity with standard computational techniques for handling
text corpora;
- to develop an understanding of the basic problems and
principles underlying text processing applications.
|
Learning Outcomes |
By the end of this unit, a candidate should be able to:
- write code in a programming language well suited to text
handling
- identify and explain key techniques that are relevant
to performing a number of text processing tasks
- implement systems able to analyse large volumes of
textual data, and to perform basic text processing tasks
|
Content |
- Programming for text processing
- Text processing topics, such as:
  - Text Encoding and Text Compression
  - Vector-based Representations for Words and Documents
  - Information Retrieval
  - Information Extraction
  - Sentiment Analysis
  - Summarisation
|
Teaching Method |
- There will be two lectures per week, with no more than
20 lectures overall.
- In some weeks, a third session will be available for lab
classes or tutorials.
|
Feedback |
Students can discuss their lab exercise code during lab
sessions. They will receive feedback comments on their
marked assignment work later in the term, before the
exam. |
|