
Models for Language Processing (KI3V21001)

Completed: 26-06-2023 | 7.5 EC | Universiteit Utrecht

What I Learned

In this course (KI3V21001), I explored computational models for natural language processing, with a focus on syntax, semantics, and reasoning. Assessment consisted of four assignments and a final exam. Below is a breakdown of the topics covered:

Syntactic Parsing and Semantics

Dependency Parsing: Studied dependency grammars and Universal Dependencies.

CCG Parsing: Explored Combinatory Categorial Grammar for syntactic analysis.

Compositional Semantics: Learned to derive meaning from syntactic structures.
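How CCG categories can drive compositional semantics is easiest to see with plain function application. The sketch below is a toy illustration (the lexicon and string-based category matching are my simplifications, not the course's actual grammar): forward application combines X/Y with Y, backward application combines Y with X\Y, and the meaning of the functor is applied to the meaning of the argument.

```python
# Toy sketch of CCG function application driving compositional semantics.
# Categories are plain strings and matching is naive string comparison,
# which is enough for this illustrative lexicon.

def strip_parens(cat):
    """Naively drop one pair of outer parentheses, e.g. (S\\NP) -> S\\NP."""
    return cat[1:-1] if cat.startswith("(") and cat.endswith(")") else cat

def forward_apply(left, right):
    """Forward application: X/Y  Y  =>  X, meaning f(x)."""
    (cat_l, sem_l), (cat_r, sem_r) = left, right
    if cat_l.endswith("/" + cat_r):
        return (strip_parens(cat_l[: -len("/" + cat_r)]), sem_l(sem_r))
    return None

def backward_apply(left, right):
    """Backward application: Y  X\\Y  =>  X, meaning f(x)."""
    (cat_l, sem_l), (cat_r, sem_r) = left, right
    if cat_r.endswith("\\" + cat_l):
        return (strip_parens(cat_r[: -len("\\" + cat_l)]), sem_r(sem_l))
    return None

# "loves" has category (S\NP)/NP: a curried two-place predicate that
# first consumes its object, then its subject.
lexicon = {
    "John": ("NP", "john"),
    "Mary": ("NP", "mary"),
    "loves": ("(S\\NP)/NP", lambda obj: lambda subj: f"loves({subj},{obj})"),
}

vp = forward_apply(lexicon["loves"], lexicon["Mary"])  # (S\NP, λx.loves(x,mary))
sentence = backward_apply(lexicon["John"], vp)
print(sentence)  # ('S', 'loves(john,mary)')
```

The derivation mirrors the usual CCG analysis of a transitive sentence: the verb combines with its object first, and the resulting S\NP then consumes the subject.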


Logic and Inference in NLP

Natural Logic & Tableau: Applied tableau methods for natural language inference.
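The core idea of the tableau method, checking entailment by trying to close every branch of a refutation tree, can be sketched for propositional logic. This is my illustrative stand-in: the course's natural-logic tableau operates on natural language expressions directly, whereas the formulas and connectives below are a minimal propositional version.

```python
# Minimal propositional tableau prover (illustrative sketch, not the
# course's natural-logic tableau). Atoms are strings; compounds are
# ("not", f), ("and", f, g), ("or", f, g).

def closed(branch):
    """A branch closes when it contains some formula and its negation."""
    return any(("not", f) in branch for f in branch)

def expand(branch):
    """Expand one non-literal formula; return the child branches,
    or None when only literals remain."""
    for f in branch:
        if not isinstance(f, tuple):
            continue                        # positive atom: a literal
        rest = branch - {f}
        if f[0] == "and":                   # both conjuncts, same branch
            return [rest | {f[1], f[2]}]
        if f[0] == "or":                    # disjunction splits the branch
            return [rest | {f[1]}, rest | {f[2]}]
        if f[0] == "not" and isinstance(f[1], tuple):
            g = f[1]
            if g[0] == "not":               # double negation
                return [rest | {g[1]}]
            if g[0] == "and":               # ¬(A∧B) => ¬A | ¬B
                return [rest | {("not", g[1])}, rest | {("not", g[2])}]
            if g[0] == "or":                # ¬(A∨B) => ¬A, ¬B
                return [rest | {("not", g[1]), ("not", g[2])}]
    return None

def satisfiable(branch):
    if closed(branch):
        return False
    children = expand(branch)
    if children is None:
        return True                         # open, fully expanded branch
    return any(satisfiable(child) for child in children)

def entails(premises, conclusion):
    """Premises entail the conclusion iff adding the negated conclusion
    closes every tableau branch."""
    return not satisfiable(set(premises) | {("not", conclusion)})

# Disjunctive syllogism: p∨q, ¬p ⊢ q
print(entails([("or", "p", "q"), ("not", "p")], "q"))  # True
print(entails(["p"], "q"))                             # False
```

The refutation strategy is the same one used for NLI with natural-logic tableaux: assume the premises together with the negated hypothesis, and conclude entailment exactly when no open branch survives.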

Word Senses & WordNet: Investigated lexical semantics using WordNet.

Natural Language Inference (NLI): Analyzed entailment and related tasks.


Machine Translation and Diversity

Machine Translation: Covered statistical and neural approaches.

Language Diversity: Explored typological variations across languages.

Pretrained Language Models: Studied pretrained models such as BERT and their contextual embeddings.


Practical Application

Assignment 1: Parsing: Implemented dependency and constituency parsing with spaCy and CoreNLP, analyzing projectivity and PP-attachment.
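The projectivity analysis from this assignment boils down to a simple check: a dependency tree is projective iff no two arcs cross. A minimal sketch, using hand-built head lists rather than actual spaCy/CoreNLP output:

```python
# Projectivity check for a dependency tree given as a head list
# (illustrative sketch; real trees would come from a parser).

def is_projective(heads):
    """heads[d] is the head index of token d (0-based), -1 for the root.
    Two arcs cross when one endpoint of one arc lies strictly inside
    the span of the other while its second endpoint lies outside."""
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads) if h != -1]
    for a, b in arcs:
        for c, d in arcs:
            if a < c < b < d:   # spans overlap without nesting
                return False
    return True

# "John loves Mary": loves -> John, loves -> Mary; arcs nest, no crossing.
print(is_projective([1, -1, 1]))     # True
# Arc 0-2 crosses arc 1-3, the configuration seen in extraposed clauses.
print(is_projective([-1, 3, 0, 0]))  # False
```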

Assignment 2: Meaning: Developed WSD systems (Most Frequent Sense, Simple Lesk, Vector-based Lesk) and lexical relation predictors using WordNet and Prolog-based reasoning.
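The Simple Lesk baseline from this assignment picks the sense whose gloss shares the most words with the target's context. The sketch below uses a hypothetical two-sense inventory as a stand-in for WordNet, so the sense labels and glosses are illustrative:

```python
# Simple Lesk sketch: score each sense by gloss/context word overlap.
# The inventory is a toy stand-in for WordNet synsets and glosses.

def simple_lesk(word, context, inventory):
    ctx = set(context.lower().split())
    def overlap(entry):
        return len(set(entry["gloss"].lower().split()) & ctx)
    return max(inventory[word], key=overlap)["sense"]

inventory = {
    "bank": [
        {"sense": "bank.n.01", "gloss": "sloping land beside a body of water"},
        {"sense": "bank.n.02", "gloss": "a financial institution that accepts deposits"},
    ],
}

print(simple_lesk("bank", "we walked on the sloping land by the water", inventory))
# -> bank.n.01
print(simple_lesk("bank", "the institution accepts deposits and loans", inventory))
# -> bank.n.02
```

The Most Frequent Sense baseline falls out for free: with no overlap anywhere, `max` keeps the first listed sense, which in WordNet ordering is the most frequent one.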

Assignment 3: Translation: Analyzed variation across languages and implemented Byte Pair Encoding for subword tokenization.
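Byte Pair Encoding learns a subword vocabulary by repeatedly merging the most frequent adjacent symbol pair. A minimal Sennrich-style sketch (the word-frequency toy corpus is illustrative, and end-of-word markers are omitted for brevity):

```python
# BPE merge learning sketch: start from characters, repeatedly merge the
# most frequent adjacent pair across the (toy) word-frequency corpus.
from collections import Counter

def learn_bpe(words, num_merges):
    vocab = {tuple(w): f for w, f in words.items()}
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        merged = best[0] + best[1]
        # Re-segment every word with the new merge applied.
        new_vocab = {}
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(merged)
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_vocab[tuple(out)] = freq
        vocab = new_vocab
    return merges, vocab

words = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
merges, vocab = learn_bpe(words, 4)
print(merges)  # first merges: ('e', 's'), then ('es', 't')
```

On this corpus the frequent suffix "est" is assembled first, which is exactly why BPE copes well with morphologically rich, diverse languages: productive affixes become reusable subword units.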

Assignment 4: Regression: Trained linear and logistic regression models on GloVe embeddings for concreteness and hypernymy prediction, plus fine-tuned DistilBERT for entailment.
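The regression setup can be sketched end to end with a pure-NumPy logistic regression trained by gradient descent. The synthetic Gaussian "embeddings" below are a stand-in for GloVe vectors, and the two clusters play the role of the positive and negative hypernymy classes:

```python
# Logistic regression by batch gradient descent (toy stand-in for
# predicting a binary label such as hypernymy from GloVe embeddings).
import numpy as np

def train_logreg(X, y, lr=0.1, epochs=500):
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid probabilities
        grad = p - y                             # dL/dlogits for log loss
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

rng = np.random.default_rng(0)
# Synthetic "embeddings": two Gaussian clusters standing in for the
# embedding vectors of the two classes.
X = np.vstack([rng.normal(-1, 1, (50, 4)), rng.normal(1, 1, (50, 4))])
y = np.array([0] * 50 + [1] * 50)

w, b = train_logreg(X, y)
acc = (((1.0 / (1.0 + np.exp(-(X @ w + b)))) > 0.5) == y).mean()
print(f"train accuracy: {acc:.2f}")
```

Swapping the sigmoid and log-loss gradient for an identity output and squared error gives the linear-regression variant used for the continuous concreteness target; the DistilBERT fine-tuning part replaces the fixed GloVe features with learned contextual representations.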