Amaryllidaceae alkaloids (AAs) are a valuable class of plant specialized metabolites with diverse pharmacological properties. However, discovering the enzymes involved in AA biosynthesis by traditional methods has long been hampered by the need for labor-intensive screening and optimized growth conditions. Here, we introduce a Support Vector Machine (SVM)-based approach that overcomes these challenges by predicting enzyme-substrate interactions solely from amino acid sequences and molecular fingerprints. We trained the model on a set of 90 enzyme sequences split equally between positive and negative examples: positive enzymes were selected based on the chemical similarity of their substrates to the substrate of interest, 4'-O-methylnorbelladine (4OMET), and negative enzymes were those active toward 4OMET decoy molecules. Applying this prediction model to transcriptomic data from Crinum asiaticum bulbs identified 19 putative cytochrome P450 enzymes. Functional assays in heterologous systems showed that five candidates reproducibly depleted 4OMET, including a CYP81-like candidate, a P450 class not previously linked to 4OMET turnover. Overall, this strategy bypasses the need for stringent alkaloid accumulation conditions and precise tissue sampling in enzyme discovery, offering a scalable and cost-effective way to select candidates for downstream biochemical characterization and pathway elucidation.
Valderruten-Cajiao et al. (Thu,) studied this question.
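The candidate-selection idea in the abstract (an SVM scoring enzyme-substrate pairs from sequence and fingerprint features) can be illustrated with a minimal, dependency-free sketch. Everything concrete here is an assumption made for illustration and is not taken from the study: the amino acid composition encoding, the 4-bit toy fingerprints, the five-residue placeholder sequences, the linear kernel, and the hyperparameters. The authors' actual features, kernel, and software are not stated in the abstract.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_BITS = 4  # toy fingerprint length (assumption); real fingerprints are far longer


def composition(seq):
    """Fraction of each of the 20 standard amino acids in a protein sequence."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints stored as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def featurize(seq, fingerprint):
    """Concatenate enzyme composition with the substrate's fingerprint bits."""
    bits = [1.0 if i in fingerprint else 0.0 for i in range(N_BITS)]
    return composition(seq) + bits


def decision(w, b, x):
    """Signed score of the linear model; positive means 'predicted active'."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.001, lr=0.1, epochs=100):
    """Linear SVM fitted by subgradient descent on the L2-regularised hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * decision(w, b, xi) < 1.0
            w = [wj * (1.0 - lr * lam) for wj in w]  # gradient step on the L2 penalty
            if violated:  # hinge loss is active for this example
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


# Placeholder fingerprint standing in for 4OMET (hypothetical bits).
TARGET_FP = {0, 1}

# Toy training pairs: (enzyme sequence, fingerprint of its substrate, label).
# +1: substrate chemically similar to the target; -1: enzyme active on a decoy.
training = [
    ("MALLT", {0, 1}, +1),
    ("MAVLS", {0, 1, 2}, +1),
    ("MGKKE", {2, 3}, -1),
    ("MGRKD", {3}, -1),
]
X = [featurize(seq, fp) for seq, fp, _ in training]
y = [label for _, _, label in training]
w, b = train_linear_svm(X, y)

# Score hypothetical transcriptome candidates against the target substrate.
candidates = {"cand_A": "MALLS", "cand_B": "MGKRE"}
scores = {name: decision(w, b, featurize(seq, TARGET_FP))
          for name, seq in candidates.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this sketch the ranked list plays the role of the shortlist passed on to functional assays; the `tanimoto` helper shows one common way to quantify the substrate similarity used to pick positive examples, for instance `tanimoto(TARGET_FP, {0, 1, 2})`.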