In this course (KI2V13007), I applied computational methods to linguistic analysis and modeling. It combined theory and practice through lectures and Python-based practica, and was assessed via a final exam and two assignments. Below is a breakdown of the topics covered:
Edit Distance: Learned to measure string similarity for text processing.
N-grams: Studied probabilistic language models and smoothing techniques.
Regular Expressions: Explored pattern matching in text corpora.
Part-of-Speech Tagging: Gained familiarity with tagging using hidden Markov models.
Information Retrieval: Analyzed document ranking and retrieval methods.
Speech Synthesis: Investigated techniques for generating spoken language.
Bayesian Statistics: Covered fundamentals for probabilistic modeling.
Classification: Learned naive Bayes and logistic regression for text tasks.
Gaussian Mixture Models: Explored clustering with GMMs and was introduced to the EM algorithm.
Automatic Speech Recognition: Studied methods for converting speech to text.
Corpus Analysis: Worked with text and speech datasets.
Assignment 1: Corpus Analysis: Wrote a program for text searching and N-gram modeling, using regex and file I/O.
Assignment 2: Tagging and Classification: Developed a POS tagger and classifier, integrating HMMs and Bayesian methods using NLTK.
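To make the first topic above concrete, here is a minimal sketch of edit distance in Python. It assumes the standard Levenshtein formulation with a cost of 1 for each insertion, deletion, and substitution; the function name `edit_distance` is illustrative and is not taken from the course materials or the assignments.

```python
def edit_distance(source: str, target: str) -> int:
    """Levenshtein distance, assuming unit cost for insert, delete, substitute."""
    # previous[j] = distance between the source prefix processed so far and target[:j].
    # Before any source character is read, reaching target[:j] takes j insertions.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        # Reaching the empty target prefix from source[:i] takes i deletions.
        current = [i]
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_char != t_char)  # free if chars match
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


print(edit_distance("kitten", "sitting"))
```

Only two rows of the dynamic-programming table are kept at a time, so memory grows with the length of `target` rather than with the product of both lengths.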