Little Pluses: Reading list

(Note 2020-01-18: This reading list is outdated.)

Reading

Booth et al.: The Craft of Research

Read

2019-08-08 Security amplification
2019-08-07 Meta-execution
2019-08-06 Reliability amplification
2019-08-02 Techniques for optimizing worst-case performance
2019-08-01 Thoughts on reward engineering
2019-07-30 Learning with catastrophes
2019-07-30 Capability amplification
2019-07-25 The reward engineering problem
2019-07-24 Directions and desiderata for AI alignment
2019-07-23 AlphaGo Zero and capability amplification
2019-07-22 Supervising strong learners by amplifying weak experts
2019-07-16 Benign model-free RL
2019-07-15 Unpublished draft of an article on quantum computing and AI alignment.
2019-07-11 Iterated Distillation and Amplification
2019-07-10 Corrigibility
2019-07-09 Humans Consulting HCH
2019-07-09 Approval-directed bootstrapping
2019-07-05 Understanding Iterated Distillation and Amplification: Claims and Oversight
2019-06-27 Approval-directed agents
2019-06-18 Embedded World-Models and the corresponding section of Embedded Agency. Time-boxed, so I understood only part of it.
2019-06-14 Decision Theory and the corresponding section of Embedded Agency. I didn't understand everything, but I stopped because it was taking too much time.
2019-06-12 Future directions for ambitious value learning
2019-06-12 Model Mis-specification and Inverse Reinforcement Learning
2019-06-11 Latent Variables and Model Mis-Specification
2019-06-10 Humans can be assigned any values whatsoever…
2019-06-07 The easy goal inference problem is still hard
2019-06-05 What is ambitious value learning?
2019-06-05 Preface to the sequence on value learning
2019-06-05 Embedded Agents and the corresponding section of Embedded Agency.
2019-06-03 Prosaic AI alignment
2019-05-31 An unaligned benchmark
2019-05-29 Clarifying "AI Alignment"
2019-05-28 The Steering Problem
2019-05-28 Preface to the sequence on iterated amplification
