Deep reinforcement learning from human preferences | Synapse