Statistical Models and Methods for Business Analytics

IDS 575 (Spring 2019)

Document version: Jan 27 2019

All announcements will be made on the Forum.

Overview

The goal of this class is to cover the foundations of modern statistics and machine learning, complementing the data mining focus of IDS 572. You will get up to speed on the requisite background and the key theoretical underpinnings of modern analytics, viewed through the lens of statistical machine learning.

Previous Editions

Logistics

Textbook and Materials

Software

Schedule (tentative)

01/14 : Supervised Learning: Linear Models and Least Squares, k-Nearest Neighbor Methods

01/28 : Towards Regression: Statistical Decision Theory, Curse of Dimensionality, Linear Regression, Categorical Variables, Interaction Terms

02/04 : Regression I: Bias-Variance Trade-off, Subset Selection, Cross-Validation

02/11 : Regression II: Ridge Regression, LASSO (Least Absolute Shrinkage and Selection Operator)

02/18 : Classification: Linear Discriminant Analysis, Logistic Regression, Model Assessment and Selection: AIC, BIC and Validation

02/25 : The Bootstrap, Maximum Likelihood Estimation and Review of Linear Models

03/11 : Expectation Maximization and Sampling (Markov Chain Monte Carlo)

03/18 : Applications of regression, classification and likelihood maximization

04/01 : Tree Methods, AdaBoost and Gradient Boosting

04/08 : Random Forests, Multivariate Adaptive Regression Splines and Support Vector Machines

04/15 : Kernel Trick, Introduction to Unsupervised Learning, Association Rules

04/22 : Unsupervised Learning: Clustering, Principal Component Analysis and Spectral Clustering

04/29 : Time Series and Supervised Learning, and the ARMA Model

Assignments

  1. 01/28: Assignment 1 out. Due on 02/10
  2. 02/11: Assignment 2 out. Due on 02/24
  3. 02/25: Assignment 3 out. Due on 03/17
  4. 04/01: Assignment 4 out. Due on 04/14
  5. 04/15: Assignment 5 out. Due on 04/28

These assignments involve reimplementing statistical techniques and understanding their behavior on interesting datasets. Always cite your sources in your assignment solutions. Submissions are due BEFORE 11:59 PM on the due date. Late submissions incur an automatic 20% penalty per day. Use Blackboard for uploads.
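As a rough illustration of the 20% per day late penalty, here is a minimal Python sketch. It rests on assumptions the syllabus does not state: that the penalty is deducted additively from the earned score (20% after one day, 40% after two, and so on), that any part of a day counts as a full day, and that the score cannot drop below zero. The function name `late_penalty_score` and its parameters are hypothetical; the instructor's actual rule may differ.

```python
import math


def late_penalty_score(raw_score: float, hours_late: float,
                       penalty_per_day: float = 0.20) -> float:
    """Return the score after a late penalty (illustrative assumptions only)."""
    if hours_late <= 0:
        # Submitted on time: no penalty.
        return raw_score
    # Assumption: any part of a day late counts as a full day.
    days_late = math.ceil(hours_late / 24)
    # Assumption: the penalty accumulates additively and is floored at zero.
    factor = max(0.0, 1.0 - penalty_per_day * days_late)
    return raw_score * factor
```

Under these assumptions, a submission that is five or more days late receives no credit.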

Exams

  1. 03/04: Exam I (same venue as lectures, and during class hours)
  2. 05/06: Exam II (same venue as lectures, and during class hours)

These exams are closed book, but one 8.5x11-inch handwritten cheat sheet is allowed. No computers or communication devices are allowed.

Grades

Miscellaneous Information