Learning to Optimize Multi-Objective Alignment Through Dynamic Reward Weighting | Synapse