In this course (201900027), I’m developing practical skills in data analysis and visualization, focusing on real-world applications in R. The course emphasizes hands-on experience with statistical methods, machine learning, and visualization techniques, and is assessed through weekly homeworks, a group assignment, and a digital exam. Below is a breakdown of the topics covered, followed by the assessed work:
Exploratory Data Analysis (EDA): Learned to explore datasets using summary statistics and visualizations to uncover patterns.
Supervised Learning: Mastered techniques like linear regression, logistic regression, and K-nearest neighbors for predictive modeling.
Model Evaluation: Studied model fit, cross-validation, and error metrics (e.g., mean squared error) to assess performance.
Linear Regression with Big Data: Applied subset selection and shrinkage methods (ridge regression, lasso) to handle high-dimensional datasets.
Tree-Based Methods: Explored decision trees and random forests for regression and classification tasks.
Text Mining: Learned preprocessing, sentiment analysis, and frequency analysis (e.g., TF-IDF) for text data.
Network Analysis: Studied network representations, centrality measures, and community detection using igraph.
Grammar of Graphics: Mastered ggplot2 for creating effective visualizations, including scatter plots, density plots, and labeled graphs.
Interactive Visualizations: Built Shiny apps for user-driven data dashboards that update in real time as inputs change.
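As a rough illustration of the model evaluation topic above, here is a minimal sketch of k-fold cross-validation with mean squared error in base R. It is not taken from the course materials: the helper name `cv_mse`, the built-in `mtcars` dataset, the two formulas, and the choice of 5 folds are all assumptions made for the example.

```r
# Minimal sketch: 5-fold cross-validated MSE for two linear models.
# mtcars ships with base R; formulas and k = 5 are illustrative choices.
set.seed(42)

k <- 5
# Randomly assign each row to one of k roughly equal folds
folds <- sample(rep(seq_len(k), length.out = nrow(mtcars)))

# Hypothetical helper: average held-out MSE of lm() across the given folds
cv_mse <- function(formula, data, folds) {
  response <- all.vars(formula)[1]          # name of the outcome variable
  fold_mse <- vapply(sort(unique(folds)), function(i) {
    train <- data[folds != i, , drop = FALSE]
    test  <- data[folds == i, , drop = FALSE]
    fit   <- lm(formula, data = train)      # fit on the training folds only
    pred  <- predict(fit, newdata = test)   # predict the held-out fold
    mean((test[[response]] - pred)^2)       # MSE on the held-out fold
  }, numeric(1))
  mean(fold_mse)
}

# Both models are scored on the same folds so the comparison is fair
mse_small <- cv_mse(mpg ~ wt, mtcars, folds)
mse_large <- cv_mse(mpg ~ wt + hp + disp, mtcars, folds)
print(c(small = mse_small, large = mse_large))
```

The model with the lower cross-validated MSE is the one expected to predict unseen data better, which is not necessarily the one with the best fit on the training data.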
Weekly Homeworks: Completed R-based exercises on EDA, model fitting, and visualization, graded pass/fail.
Group Assignment (Part 1): Conducted linear regression with subset selection or shrinkage methods on a dataset, creating visualizations to summarize findings.
Group Assignment (Full): Developed a Shiny app to visualize and analyze a chosen dataset, integrating supervised learning and interactive visualizations.
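The shrinkage idea behind Part 1 of the group assignment can be sketched with ridge regression written out by hand in base R. This is only an illustration under stated assumptions: the helper name `ridge_coef`, the `mtcars` predictors, and the penalty value of 10 are made up for the example, and it is not the dataset or code from the assignment. In practice a package such as glmnet would normally be used instead of the closed-form solution.

```r
# Minimal sketch: ridge regression via its closed-form solution.
# Hypothetical helper; predictors are standardized so the penalty treats them equally.
ridge_coef <- function(X, y, lambda) {
  Xs <- scale(X)          # center and scale each predictor
  yc <- y - mean(y)       # center the response, so no intercept is needed
  p  <- ncol(Xs)
  # Solve (X'X + lambda * I) beta = X'y
  beta <- solve(crossprod(Xs) + lambda * diag(p), crossprod(Xs, yc))
  drop(beta)              # return a named vector instead of a p x 1 matrix
}

X <- as.matrix(mtcars[, c("wt", "hp", "disp")])
y <- mtcars$mpg

beta_ols   <- ridge_coef(X, y, lambda = 0)    # lambda = 0 reduces to least squares
beta_ridge <- ridge_coef(X, y, lambda = 10)   # a positive lambda shrinks the coefficients

print(rbind(ols = beta_ols, ridge = beta_ridge))
```

Increasing `lambda` pulls the coefficient vector toward zero, trading a little bias for lower variance. The penalty would usually be chosen by cross-validation rather than fixed by hand.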