In the previous blog post we discussed the gorey details of generalised policy learning – the first task of CRL. We went into some very detailed mathematical description of dynamic treatment regimes and generalised modes of learning for data processing agents. The next task is a bit more conceptual and focuses on the question on how to identfy optimal areas of intervention in a system. This is clearly very important for an RL agent where its entire learning mechanism is based on these very interventions in some system with a feedback mechanism. Let’s begin!
- Causal Reinforcement Learning
- Preliminaries for CRL
- CRL Task 1: Generalised Policy Learning
- CRL Task 2: Interventions – When and Where?
- CRL Task 3: Counterfactual Decision Making
- CRL Task 4: Generalisability and Robustness
- (Coming soon) Task 5: Learning Causal Models
- (Coming soon) Task 6: Causal Imitation Learning
- (Coming soon) Wrapping Up: Where To From Here?