Inverse design of thermally active composite via policy-transferred reinforcement learning | Synapse