# INTELLIGENT AGENTS & DECISION (CS_533_001_S2020)

**Lecture Schedule and Slides**

**Instructor:** Alan Fern

**TA:** Aashish Adhikari

**Supplementary Online Textbook**

Reinforcement Learning: An Introduction

Richard Sutton and Andrew G. Barto

Second Edition, in progress

MIT Press

**Note on Textbook:** The authors very generously make the book freely available online, but you can also purchase a hard copy at the above site. It is definitely a book worth having if you can afford the purchase. The course lecture slides from the instructor are relatively self-contained, but the supplementary textbook offers many valuable perspectives and examples. Note that the mathematical notation in the book and the course slides will not always be consistent.

#### Remote Instruction Hours

(Zoom Link -- https://oregonstate.zoom.us/j/557481605)

Monday & Wednesday

Lecture 10:00am-11:20am

Instructor Office Hours 11:20am-noon

#### TA Remote Office Hours

(Zoom Link -- https://oregonstate.zoom.us/j/331849022)

Tuesday, Thursday 4-5

**Description**

In this course we will study models and algorithms for automated planning and decision making. The course will be divided into three main sections.

1) We will study planning in the context of Markov decision processes (MDPs) where the environment is allowed to be stochastic. We will cover the basic theory and algorithms for explicit state-space MDPs for exactly solving small to moderately sized problems.
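Value iteration is one of the basic algorithms for exactly solving explicit state-space MDPs. The sketch below is a minimal illustration on a hypothetical two-state MDP (the transition table, rewards, and discount factor are made up for this example, not taken from the course): it repeatedly applies the Bellman optimality backup until the value function stops changing.

```python
# A hypothetical two-state MDP for illustration: P[s][a] is a list of
# (probability, next_state, reward) outcomes for taking action a in state s.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
gamma = 0.9  # discount factor

def value_iteration(P, gamma, tol=1e-8):
    """Apply the Bellman optimality backup until the values stop changing."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            # Best expected one-step return, bootstrapping from current V
            backup = max(sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])
                         for a in P[s])
            delta = max(delta, abs(backup - V[s]))
            V[s] = backup
        if delta < tol:
            return V

V = value_iteration(P, gamma)
```

Exact methods like this sweep over every state, which is why they only scale to small or moderately sized problems.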

2) We will study the basic theory and algorithms for reinforcement learning (RL), where the agent is not given a model of the environment, but instead must learn to act in the world by directly interacting with the environment. We will learn about model-based approaches and the two primary model-free RL paradigms, temporal-difference learning and policy gradient methods. The course will study how the paradigms can be applied to learn both linear and non-linear agent architectures (including what is now known as Deep RL).
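As a concrete instance of temporal-difference learning, here is a minimal tabular Q-learning sketch on a made-up five-state corridor (the environment, step size, and episode count are illustrative assumptions, not course code). Note that the agent only interacts with the environment through `step`; it never sees the transition model.

```python
import random

# Hypothetical 5-state corridor (states 0..4). Action 0 moves left,
# action 1 moves right; reaching state 4 ends the episode with reward 1.
N_STATES, GOAL = 5, 4
alpha, gamma, epsilon = 0.5, 0.9, 0.1  # illustrative parameters

def step(s, a):
    """One environment transition (hidden from the learning agent)."""
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    reward = 1.0 if s2 == GOAL else 0.0
    return s2, reward, s2 == GOAL

random.seed(0)
Q = [[0.0, 0.0] for _ in range(N_STATES)]
for _ in range(500):  # training episodes
    s, done = 0, False
    while not done:
        # epsilon-greedy action selection
        if random.random() < epsilon:
            a = random.randrange(2)
        else:
            a = max((0, 1), key=lambda x: Q[s][x])
        s2, r, done = step(s, a)
        # TD update toward the bootstrapped target
        target = r if done else r + gamma * max(Q[s2])
        Q[s][a] += alpha * (target - Q[s][a])
        s = s2

# Greedy policy after learning: action 1 (right) in every non-goal state
policy = [max((0, 1), key=lambda a: Q[s][a]) for s in range(N_STATES)]
```

The same update rule carries over to linear and deep architectures by replacing the table with a parameterized Q-function.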

3) We will study the area of Monte-Carlo planning, which is a middle ground between reinforcement learning and MDP planning, where a simulator of the system to be controlled is available and can be used to make intelligent action choices.
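A simple example of the Monte-Carlo planning setting is uniform rollout evaluation. The sketch below (with a made-up one-dimensional simulator and illustrative parameters) estimates each action's value at the current state purely by averaging sampled returns from the simulator, without ever building an explicit model.

```python
import random

# Made-up noisy chain: the planner only has simulator access to the
# system, not its transition probabilities.

def simulate(s, a, rng):
    """One simulated step: the chosen direction succeeds 90% of the time."""
    move = a if rng.random() > 0.1 else 1 - a
    s2 = s + (1 if move == 1 else -1)
    return s2, float(s2)  # reward grows with the state index

def rollout_return(s, a, depth, gamma, rng):
    """Take action a, then follow a uniformly random rollout policy."""
    s, r = simulate(s, a, rng)
    total, discount = r, gamma
    for _ in range(depth - 1):
        s, r = simulate(s, rng.randrange(2), rng)
        total += discount * r
        discount *= gamma
    return total

def mc_plan(s, n_rollouts=200, depth=10, gamma=0.9):
    """Average rollout returns per action, then act greedily."""
    rng = random.Random(0)
    means = {a: sum(rollout_return(s, a, depth, gamma, rng)
                    for _ in range(n_rollouts)) / n_rollouts
             for a in (0, 1)}
    return max(means, key=means.get)
```

More sophisticated Monte-Carlo planners bias the sampling toward promising actions, but the simulator-only interface is the same.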

**Implementation Assignments**

There will be a number of assignments, which will involve some amount of implementation of algorithms and experimentation.

We will use Intel's DevCloud for assignments, which will let us consider distributed (multi-core) implementations.

No prior distributed programming experience is needed, but Python will be the required language for this course.

**Written Questions**

There will be several sets of written questions posted for students to work through, with solutions made available a week after posting. The written questions will not be graded, but understanding the concepts they raise will be important for doing well on the quizzes.

**Quizzes**

There will be three quizzes, each announced at least a week ahead of time. The quizzes will cover the conceptual and theoretical material taught in class.

**Grades**

The final grade will be calculated as follows: Implementation Assignments 70%, Quizzes 30%

**Collaboration**

Pairs of students may work together on the implementation assignments, but students may work individually if they prefer. The instructor and TAs will actively check for copied code and solutions. The work you (or your team) turn in must be your own. Any violation of these rules will result in failing the course.

This course is offered under a **CC Attribution** license. Content in this course can be considered under this license unless otherwise noted.