What question did this study set out to answer?

The aim is to explore the potential of unnatural proteins for high-capacity data storage and retrieval.

March 3, 2026 · Open Access

Data storage and retrieval with unnatural proteins expressed via E. coli

Key Points

Encoded digital data into amino acid sequences incorporated in collagen-like protein templates.
Expressed proteins via E. coli to achieve stable data storage.
Sequenced proteins using tryptic digestion and LC-MS/MS analysis for complete data recovery.
Demonstrated random access and cryptographic data protection through affinity-tagged proteins.
Achieved successful expression of data-bearing proteins with targeted amino acids.
Established higher stability of data-bearing proteins compared to DNA.
Enabled complete data recovery from mixtures encoding multiple datasets.

Abstract

Data storage using proteins offers high capacity and stability, enabling utilization of protein techniques for data storage and retrieval. However, expressing unnatural proteins with random sequences for data storage and sequencing them for accurate data retrieval remain challenging. In this study, by encoding digital data into amino acid sequences and incorporating them into collagen-like protein templates, we achieve successful expression of the proteins via E. coli for data storage; the data-bearing proteins containing selective amino acids and arginine intervals can be sequenced through tryptic digestion followed by LC-MS/MS analysis to achieve complete data recovery, even for protein mixtures encoding multiple datasets. We further demonstrate much higher stability of the data-bearing protein than DNA, and random access and cryptographic data protection using affinity-tagged proteins. This work establishes a robust framework for protein-based data storage, opening up avenues for data storage and retrieval, protein engineering and chemistry, synthetic biology, proteomics, and beyond.
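The encode, digest, and recover loop described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration, not the scheme used in the study: the 8-letter alphabet (3 bits per residue, avoiding K, R, and P so that cleavage happens only at the inserted arginines), the block length of 8 residues, and the function names `encode`, `tryptic_digest`, and `decode`. The digestion model is also simplified to "cleave after every R".

```python
# Illustrative only: alphabet, block length, and bit mapping are assumptions,
# not the encoding scheme used in the study.
ALPHABET = "AGSTVDEF"   # 8 residues -> 3 bits each; no K, R, or P
BLOCK = 8               # hypothetical number of data residues per peptide


def encode(data: bytes) -> str:
    """Map bytes to residues and close every block with an arginine (R)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % 3)          # pad to a whole number of residues
    residues = "".join(ALPHABET[int(bits[i:i + 3], 2)]
                       for i in range(0, len(bits), 3))
    return "".join(residues[i:i + BLOCK] + "R"
                   for i in range(0, len(residues), BLOCK))


def tryptic_digest(protein: str) -> list[str]:
    """Simplified trypsin model: cleave after every R."""
    return [block + "R" for block in protein.split("R") if block]


def decode(peptides: list[str], n_bytes: int) -> bytes:
    """Rebuild the bytes from peptides given in their original order."""
    residues = "".join(p[:-1] for p in peptides)   # drop each trailing R
    bits = "".join(f"{ALPHABET.index(r):03b}" for r in residues)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, n_bytes * 8, 8))


message = b"Hi"
protein = encode(message)            # data residues with arginine intervals
peptides = tryptic_digest(protein)   # one peptide per arginine-terminated block
recovered = decode(peptides, len(message))
```

Note that this sketch keeps the peptides in their original order. Peptides identified by LC-MS/MS do not arrive in sequence order, so a practical scheme would need some way to reassemble them; that step is omitted here.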
