• Multi-Task Preferential Bayesian Optimization (MTPBO) is proposed for efficient MPC tuning
• MTPBO uses human preference data from previous tasks to accelerate tuning for a new task
• Multi-task initialization suggests promising comparisons to warm-start optimization
• Validated on a real human-in-the-loop MPC tuning problem involving four tasks
• MTPBO enables transfer learning of preferences across users and open-loop processes

The closed-loop performance of Model Predictive Control (MPC) depends on the nontrivial selection of several tuning parameters. Recently, data-efficient methods such as Bayesian Optimization (BO) have been proposed for automatic MPC tuning. In practice, it is often challenging to specify a single objective function that balances multiple criteria, especially when some of them are qualitative in nature. In these cases, Preferential Bayesian Optimization (PBO) can be used as a human-in-the-loop alternative to BO-based automatic tuning methods. Using the preferences a user expresses in pairwise comparisons of closed-loop responses, PBO searches for the optimum of an underlying utility function that reflects the user’s preferences over the closed-loop responses produced by different controller parameters. However, standard PBO does not leverage comparison data from previous tasks, so preferences must be learned from scratch for each new task. In this paper, we introduce Multi-Task PBO (MTPBO) for MPC tuning, which leverages data from previous preference-based controller tuning tasks to accelerate the search for optimal parameters for either a new human user or a similar closed-loop process. Additionally, we introduce a multi-task initialization strategy that enables a more effective warm start by proposing a batch of promising initial experiments for new tasks.
The advantages of MTPBO with initialization are shown on benchmark optimization functions and an offset-free MPC tuning problem with feedback from multiple actual human users on different simulated processes. Overall, the proposed MTPBO framework leads to more preferred responses with a lower experimental budget and can be applied to general controller tuning problems.
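To make the idea of learning a utility from pairwise comparisons concrete, the following is a minimal sketch and not the method of the paper. It assumes a finite set of candidate controller tunings, a logistic (Bradley-Terry style) preference likelihood, and an independent Gaussian prior on each utility value; the function name `fit_utilities` and all of its parameters are hypothetical. A PBO method would typically place a Gaussian process prior over a continuous parameter space instead and add an acquisition step to choose the next comparison.

```python
import numpy as np

def fit_utilities(n_items, comparisons, prior_var=1.0, lr=0.1, n_steps=2000):
    """MAP estimate of latent utilities from pairwise preferences.

    comparisons: list of (winner, loser) index pairs.
    Assumed model (for illustration only):
        P(i preferred over j) = sigmoid(u_i - u_j),
    with an independent zero-mean Gaussian prior on each u_i.
    """
    u = np.zeros(n_items)
    winners = np.array([w for w, _ in comparisons])
    losers = np.array([l for _, l in comparisons])
    for _ in range(n_steps):
        diff = u[winners] - u[losers]
        # d/d(diff) of log sigmoid(diff) equals sigmoid(-diff)
        g = 1.0 / (1.0 + np.exp(diff))
        # Gradient of the log prior
        grad = -u / prior_var
        # Accumulate likelihood gradients (indices may repeat)
        np.add.at(grad, winners, g)
        np.add.at(grad, losers, -g)
        u += lr * grad  # gradient ascent on the concave log posterior
    return u

# Three hypothetical candidate tunings; the user preferred
# tuning 2 over 0, tuning 2 over 1, and tuning 1 over 0.
u = fit_utilities(3, [(2, 0), (2, 1), (1, 0)])
best = int(np.argmax(u))
print("estimated utilities:", u)
print("most preferred candidate:", best)
```

In this toy setting the estimated utilities simply rank the candidates consistently with the stated preferences. Reusing comparison data from earlier tasks, as MTPBO does, would correspond to informing the prior with those tasks rather than starting from zero.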
Coutinho et al. (Wed,) studied this question.