Directed evolution dominates enzyme engineering, enabling new specificities, higher catalytic rates, and improved thermal stability. Its success relies on locating target functions within a vast sequence space, despite limited diversity and frequent trapping in local regions of the fitness landscape by deleterious epistasis. Exploring distant sequences requires diverse starting points, so engineering enzymes with few homologs is challenging. We propose integrating stability inference from biophysical modeling with a high-throughput self-selection scheme to generate a diverse functional enzyme library from a single wild-type sequence. In this PhD thesis, we address two parts of this problem: one computational, the other experimental. In our computational work, we adapted an existing biophysical model, trained on deep mutational scan data, to be more robust to sampling noise by adjusting our method of regularization and introducing external reference values from generalist models. We combined the two to infer mutational stability effects in small binding proteins with reasonable accuracy, using four orders of magnitude less data than previously published results. The experimental part of this thesis includes the development of a workflow that incorporates in vitro transcription, translation and replication (IVTTR) inside water-in-oil droplets, with quantification through sequencing, for the self-selection of the DNK enzyme. We devised a scheme to funnel the output of a wider purifying selection into a second, quantitative selection. Our starting variant library was produced with a new DNA shuffling protocol and a codon optimization algorithm, which together form a random synonymized pattern in the input genes that we called DNA watermarks. We carried out several selection campaigns using our scheme and addressed the practical issues we encountered. Our work shows that we can infer stability effects from the noisy data expected from self-selection.
Although IVTTR-based self-selection was inefficient, we introduced a new library design, new analysis methods, and new controls, creating a more time-efficient scheme. Together, these approaches could transform single wild types into highly diverse starting libraries for future studies.
Mats van Tongeren (Thu,) studied this question.
Mats van Tongeren