CMPUT 676: Optimization and Decision-Making under Uncertainty

Fall 2022

Instructor: Xiaoqi Tan ($ \textsf{xiaoqi.tan@ualberta.ca} $)
Location & Time: CAB 369, MW 2:00pm – 3:20pm
Office hour: After class or by appointment
Slack workplace: Join the ODMU@UofA slack workspace with your $\textsf{@ualberta.ca}$ email for online discussions about course-related questions, or topics in the general field of Optimization & Decision-Making under Uncertainty.


Course Overview

Many real-world problems involve making decisions, over a period of time, in the presence of different forms of uncertainty. These challenges arise in Internet advertising, energy sustainability, transportation, financial trading, healthcare, and a wide range of problems in artificial intelligence and machine learning. In different application scenarios, these specific decision problems might look different at first glance; however, the models and algorithms needed to address them are often similar.

In this research-oriented course, we will review recent developments and discuss open directions in the general field of decision-making under uncertainty via several modern optimization lenses. We will start by giving a brief introduction to convex optimization. Topics in this part include basic concepts of convex sets and convex functions, canonical convex problems, Lagrange multipliers, duality theory, KKT optimality conditions, and standard convex optimization algorithms. After that, we will discuss models and algorithms to handle different forms of uncertainty in optimization and decision-making. Major topics to be covered include: i) Online algorithms and Online optimization; ii) Stochastic optimization, learning, and approximation; and iii) Algorithmic game theory + Mechanism design. These topics are highly interdisciplinary – they have strong ties to various disciplines such as theoretical computer science, artificial intelligence, machine learning, economics, operations research, statistics, and control. The course is theoretical in nature, but most problems considered will be motivated and illustrated by practical examples.

Course Objectives

  1. Understand the basics of optimization theory, algorithms, and applications.

  2. Understand how to build rigorous mathematical models and develop efficient algorithms, with provable guarantees, for optimization and decision-making under different forms of uncertainty.

  3. Be well prepared to conduct research in areas such as i) online optimization, learning, and decision-making; ii) stochastic optimization, learning, and approximation; and iii) algorithmic game theory, mechanism design, and auctions.

Course Policies

Textbook: This course does not require any textbook. There will be references (e.g., lecture slides, notes, papers, and/or book chapters) suggested for each lecture.

Prerequisites: You should know material in standard UG courses in calculus, linear algebra, probability, and algorithms very well. Having some optimization background will be a bonus.

Grading: Participation (10%); Assignment (20%); Project: Proposal (20%) + Presentation (20%) + Report (30%).

Assignment: The assignment includes two parts. Part I consists of questions that everyone must answer with rigorous proofs (no programming required). In Part II, there are some open-ended questions. You are free to pick any one from the list. Solving these questions will need some mathematical derivations, and you may also find it necessary to write some code to demonstrate some numerical results. There is no designated programming language, but you are recommended to use Python if you can. The submission of your solution includes a written report and a link to the code (e.g., a repository in GitHub).

Project: You are expected to complete a research-flavored project which includes a proposal, an in-class presentation, and a written report.

For each evaluation criteria, a score will be assigned between 1 and 5 (i.e., 1: very low, 2: low, 3: medium, 4: high, 5: very high). When preparing your proposal and report, you are highly recommended to use Latex (template here).

Course Schedule

Date Topic References
Sept. 5 No Class (Labour Day)
Sept. 7 Lec 1: Overview
Course topics; Logistics
L1-slides
Convex Optimization: A Brief Introduction
Sept. 12 Lec 2: Concepts
Convex sets
Convex functions
Convex problems
L2-slides
BV Book: Ch1-Ch4
Sept. 14 Lec 3: Theory
Lagrange multipliers
Duality theory
Optimality conditions
L3-slides
BV Book: Ch5
DB Book
Sept. 19 Lec 4: Algorithms
Gradient descent
Newton’s methods
Barrier methods
Dual ascent
L4-slides, notes
BV Book: Ch9-Ch11
Online Algorithms; Online Optimization
Sept. 21 Lec 5: Online Algorithms
Ski rental problem
Deterministic vs Randomized
Yao’s principle
L5-slides
Reading: Paper 1, Paper 2
Extra reading: Paper 3
Sept. 26 Lec 6: Online Primal-Dual (OPD)
Time series search
$k$-max and $k$-min search
One-way trading
The OPD approach (1st look)
L6-slides, notes
Reading: Paper 1, Paper 2
Extra reading: Paper 3, Paper 4
Sept. 28 Lec 7: Online Knapsack
Threshold-based algorithms
The OPD approach (2nd look)
L7-slides, notes
Reading: Paper 1, Paper 2
Extra reading: Paper 3, Paper 4
More related papers
Oct. 3 Lec 8: Online Matching
RANKING algorithm
The OPD approach (3rd look)
L8-slides
AM Survey, OPD Survey
Reading: Paper 1, Paper 2, Paper 3
Extra reading: Paper 4
Oct. 5 Lec 9: Online Convex Optimization (OCO)
The Experts problem
OCO: intro + algorithms + variants
Competitive ratio vs. Regret (1st look)
L9-slides
EH Book, CBL Book, SSS Survey
SOCO reading: Paper 1, Paper 2, Paper 3
CCB reading: Paper 4, Paper 5
Oct. 10 No Class (Thanksgiving Day)
Oct. 12 Lec 10: Online Linear Programming (OLP)
OLP: intro + variants
Adversarial inputs vs. Random inputs
Competitive ratio vs. Regret (2nd look)
L10-slides
OLP reading: Paper 1, Paper 2
ROM reading: Paper 3, Paper 4
Oct. 17 Lec 11: Beyond Worst-Case Analysis (WCA)
Beyond WCA: why + how
Learning-augmented algorithms
Data-driven algorithms
L11-slides
Beyond WCA: Paper 1, Paper 2, Paper 3
Reading: Paper 4, Paper 5
Stochastic Optimization and Learning
Oct. 19 Lec 12: Markov Decision Process (MDP)
MDPs: A brief intro
Algorithms: Value/policy iteration; LP approach
Example: energy storage control
L12-slides
Oct. 24 Lec 13: Reinforcement Learning (RL)
RL: A brieft intro
Algorithms: Q-learning; Policy gradients; Actor-Critic
L13-slides
Courses: CMPUT 609, CMPUT 653
Oct. 26 Lec 14: Multi-Armed Bandits (MABs)
MABs: intro + variants
Stochatic bandits; Algorithms
L14-slides
LS Book, BCB Book, Slivkins Book
Reading: Papers on pp. 42
Oct. 31 Lec 15: Stochastic Approximation
The Robbins-Monro algorithm
Stochastic gradient descent (SGD); Adaptive SGD
Summary: Online vs. Stochastic
L15-slides
Reading: Paper 1, Paper 2, Paper 3
Algorithmic Game Theory; Mechanism Design
Nov. 2 Lec 16: Game Theory
Pure-strategy Nash equilibrium
Mixed-strategy Nash equilibrium
L16-slides
Nov. 7 No Class (Reading Week Break)
Nov. 9 No Class (Reading Week Break)
Nov. 11 Proposal Due (eClass link)
Nov. 14 Lec 17: Mechanism Design (MD)
Key concepts of MD
Auctions; Vickrey-Clarke-Groves mechanisms
Optimal auctions
L17-slides
MD reading: Econ, CS
Auctions reading: Survey
Reading: Paper 1, Paper 2
Nov. 16 Lec 18: Online Mechanism Design (OMD)
OMD: intro
Online auctions (OA)
Posted price mechanisms (PPM)
Prophet inequalities (PI)
Course summary
L18-slides
OMD reading: Paper 1, Paper 2
OA reading: Paper 3, Paper 4
PI reading: Paper 5, Paper 6, More
Assignment – Part I (pdf, tex)
Assignment – Part II (pdf, tex)
Project - Presentations
Nov. 21 Wenkai Gao
Neural Contextual Bandits for Recommendation Systems

Shuwei Wang
Online Learning of Formula-based Search Heuristics

Gábor Mihucz
Bayesian vs. Minimax Regret in Optimal Decision-Making
WG-slides


SW-slides


GM-slides
Nov. 23 Leixing Jiang
Value of Historical Information for Newsvendors

Yuxin Liu
Online Linear MDPs

Bryan Chan
Dynamic Regret of Linear MDPs

Yanqing Wu
Offline RL with Function Approximators
LJ-slides


YL-slides


BC-slides


YW-slides
Nov. 28 Yanzhao Wang
Online Selection Problems

Tales Henrique Carvalho
Online Mechanism Design for Cloud Computing

Yu Wang
One-Way Trading with Concave Returns

Qianxi Li
Mirror Descent Policy Optimization
YW-slides


THC-slides


YW-slides


QL-slides
Nov. 30 Benyamin Ghaseminia
Chimp Optimization Algorithms

Shang Wang
Multi-Task Gradient Descent

Amir Bahmani
Hyperparameter Optimization in RL

Diego Gomez
Regret Analysis in RL with Function Approximation
BG-slides


SW-slides


AB-slides


DG-slides
Dec. 5 Tian Xiang Du
Medical Image Classification using Learned Uncertainty

Chen Ma
Policy Gradient Theorem for Options via Fenchel Duality

Yuqiao Wen
Stochastic Discrete Optimization over the Sentence Space

Hamza Mustafa Alvi
Learning Mechanism for Competing Platforms using RL
TXD-slides


CM-slides


YW-slides


HMA-slides
Dec. 7 No Class Project Due (Final report)
Dec. 10 Assignment Due (Part I & Part II)