Time: Mondays and Wednesdays 15:20–16:35
Location: Bloomberg Center 161
Nathan Kallus
http://www.nathankallus.com
kallus@cornell.edu
Office hours by appointment
http://www.nathankallus.com/5751S18/
This course covers the analysis of data for making decisions with applications to electronic commerce, AI and intelligent agents, business analytics, and personalized medicine. The focus of the class is on how to make sense of data and use it to make better decisions using summarization, visualization, statistical inference, interaction, and supervised and reinforcement learning; on a framework for both conceptually understanding and practically assessing generalization, causality, and decision making using statistical principles and machine learning methods; and on how to effectively design intelligent decision-making systems. Topics include summarizing, visualizing, and comparing data distributions; drawing inferences and generalizing conclusions from data; making inferences about causal effects; A/B testing; instrumental variable analysis; sequential decision making and bandits; Markov decision processes; reinforcement learning; and ethics of data-driven decisions. Students are expected to have working knowledge of calculus, probability, and linear algebra as well as a modern scripting language such as Python or R.
Visualizing distributions
Comparing distributions
Drawing conclusions and making decisions from data
Statistical inference (frequentist)
Classification and regression
Principles of PAC learning
Maximum likelihood estimation and inference thereon
Bootstrap and resampling methods
Bayesian inference
A/B testing
Sequential decision making and bandits
Markov decision processes
Value and policy iteration
Reinforcement learning
TD- and Q-learning
Causal inference from observational data
Instrumental variable analysis
Ethics and fairness in algorithmic decision-making
Linear algebra and calculus at the level of Math 1920 and 2940, probability at the level of ORIE 3500 or ENGRD 2700.
Homework (40%)
In-class prelim (15%)
Take-home prelim (15%)
Final (20%)
Participation (10%)
There is no required textbook for the course. Lecture notes will be distributed with each lecture. Students are encouraged, but not required, to use the following books as additional reading:
All of Statistics by Larry Wasserman.
Foundations of Machine Learning by Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar.
Learning from Data by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin.
Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto.
An Introduction to Statistical Learning with Applications in R by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani.
Mostly Harmless Econometrics by Joshua D. Angrist and Jörn-Steffen Pischke.