Natural language processing (NLP) comprises many different tasks, each both challenging and interesting. Focusing on a single task and writing a paper that addresses it is not trivial. In this course, we take a broader perspective and try to identify problems in particular language datasets. Specifically, we will work on the following skills:
- knowledge of various NLP data collections and how to analyse them (basic lexical statistics, preprocessing, identifying the phenomena to examine, etc.)
- familiarity with NLP research papers that describe relevant language problems and propose solutions to them
- analytical ability to pose a research question and propose steps to address it
- use of relevant programming tools and existing libraries (including state-of-the-art methods) to analyse data and to present and visualise the results
- familiarity with evaluation methods (pointwise mutual information (PMI), chi-squared test, etc.)
- knowledge of how to structure and format a scientific paper
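As an illustration of one of the measures named above, PMI for a word pair (x, y) is log2( P(x, y) / (P(x) P(y)) ). The following is only a minimal sketch, assuming adjacent-word bigrams and plain relative-frequency estimates; the function name `pmi_scores` and the toy sentence are hypothetical and not part of the course material.

```python
import math
from collections import Counter


def pmi_scores(tokens):
    """Return PMI (log base 2) for every adjacent word pair in `tokens`.

    Probabilities are estimated as simple relative frequencies,
    which is an assumption made for this sketch (no smoothing).
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (w1, w2), count in bigrams.items():
        p_xy = count / n_bi          # joint probability of the pair
        p_x = unigrams[w1] / n_uni   # marginal probability of the first word
        p_y = unigrams[w2] / n_uni   # marginal probability of the second word
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Toy data, invented for illustration only.
toy_tokens = "new york is a big city and new york is busy".split()
scores = pmi_scores(toy_tokens)

# Print pairs from highest to lowest PMI.
for pair, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(value, 3))
```

On such a tiny sample, pairs of words that each occur only once receive the highest scores, which reflects the known tendency of PMI to favour rare events; on real datasets a minimum frequency threshold is commonly applied.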
At the end of the course, each student will submit a final project, which consists of two parts:
- paper submission (a minimum of roughly 12-15 pages)
- programming code submission (with comments and descriptions).
Projects are evaluated on clarity of writing and coding, logical consistency and coherence, and the proper use of technical methods to address relevant NLP problems in the chosen datasets.
- The deadline for project submission is 01.04.2020.