Graphs and networks offer a convenient way to study the systems around us, including ones as complex as human language. Graph-based representations have proven effective for a wide variety of Natural Language Processing (NLP) tasks.
In this course, we will seek answers to three questions: (1) how to express linguistic phenomena as graphs, (2) how to extract knowledge from these graphs, and (3) how to assess the quality of this knowledge. We will start with traditional graph-based NLP and Information Retrieval (IR) methods such as TextRank and Markov Clustering, and finish with contemporary Machine Learning approaches such as StarSpace and Graph Convolutional Networks. Since most methods described in this course are unsupervised, special attention is paid to their thorough assessment using both automatic metrics and human judgements, including crowdsourcing.
The course consists of five lectures: Language Graphs, Graph Clustering, Graph Embeddings, Evaluation, and Crowdsourcing. Each lecture walks through the corresponding algorithms step by step and suggests useful linguistic datasets. The course is aimed at advanced graduate students, data analysts, and researchers in NLP and IR, but it is not limited to them.
Lectures are in Russian, but the slides are in English.
Date and time | Class | Venue | Materials |
---|---|---|---|
17 April 18:00–19:30 | Language and Graphs, Lecture | Zoom conference, online | |
24 April 18:00–19:30 | Graph Clustering, Lecture | Zoom conference, online | |
08 May 18:00–19:30 | Graph Embeddings, Lecture | Zoom conference, online | |
15 May 18:00–19:30 | Quality Evaluation, Lecture | Zoom conference, online | |
22 May 18:00–19:30 | Crowdsourcing, Lecture | Zoom conference, online | |