**Time**: Mondays and Wednesdays 15:20–16:35

**Location**: Bloomberg Center 161

Nathan Kallus

http://www.nathankallus.com

kallus@cornell.edu

Office hours by appointment

http://www.nathankallus.com/5751S18/

This course covers the analysis of data for making decisions with applications to electronic commerce, AI and intelligent agents, business analytics, and personalized medicine. The focus of the class is on how to make sense of data and use it to make better decisions using summarization, visualization, statistical inference, interaction, and supervised and reinforcement learning; on a framework for both conceptually understanding and practically assessing generalization, causality, and decision making using statistical principles and machine learning methods; and on how to effectively design intelligent decision-making systems. Topics include summarizing, visualizing, and comparing data distributions; drawing inferences and generalizing conclusions from data; making inferences about causal effects; A/B testing; instrumental variable analysis; sequential decision making and bandits; Markov decision processes; reinforcement learning; and ethics of data-driven decisions. Students are expected to have working knowledge of calculus, probability, and linear algebra as well as a modern scripting language such as Python or *R*.

Visualizing distributions

Comparing distributions

Drawing conclusions and making decisions from data

Statistical inference (frequentist)

Classification and regression

Principles of PAC learning

Maximum likelihood estimation and inference thereon

Bootstrap and resampling methods

Bayesian inference

A/B testing

Sequential decision making and bandits

Markov decision processes

Value and policy iteration

Reinforcement learning

TD- and Q-learning

Causal inference from observational data

Instrumental variable analysis

Ethics and fairness in algorithmic decision-making

Linear algebra and calculus at the level of Math 1920 and 2940; probability at the level of ORIE 3500 or ENGRD 2700.

Homework (40%)

In-class prelim (15%)

Take-home prelim (15%)

Final (20%)

Participation (10%)

There is *no* required textbook for the course. Lecture notes will be distributed with each lecture. Students are encouraged, but not required, to use the following books as additional reading:

*All of Statistics* by Larry Wasserman.

*Foundations of Machine Learning* by Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar.

*Learning from Data* by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin.

*Reinforcement Learning: An Introduction* by Richard S. Sutton and Andrew G. Barto.

*An Introduction to Statistical Learning with Applications in R* by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani.

*Mostly Harmless Econometrics* by Joshua D. Angrist and Jörn-Steffen Pischke.