This column highlights recently published articles that are of interest to the readership of this publication. We encourage ABRF members to forward information on articles they feel are important and useful to Clive Slaughter, AU-UGA Medical Partnership, 1425 Prince Avenue, Athens GA 30606. Tel: (706) 713-2216; Fax: (706) 713-2221; Email: [email protected] or to any member of the editorial board. Article summaries reflect the reviewer’s opinions and not necessarily those of the Association.
Detection of sites of epigenetic DNA methylation, principally 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC), is generally performed by whole-genome bisulfite sequencing; bisulfite-free enzymatic-methyl sequencing; or, more recently, by sequencing with assistance of ten-eleven translocation (TET) dioxygenase to convert 5mC and 5hmC to 5-carboxylcytosine (5caC), and reduction of 5caC to dihydrouracil with pyridine borane. All these methods involve conversions of either the C base, or one of its methylated derivatives, to U (read as T). This approach is problematic because somatic mutation also commonly replaces C by T, so such mutations may be misinterpreted as unmodified cytosines when comparing sequences to a reference genome. C-to-T conversion may also contribute to misalignment with reference sequences. Füllgrabe et al. here present methodology to overcome these problems by assigning all four canonical bases, as well as 5mC and 5hmC, in a single, entirely enzymic workflow. To assign the canonical bases, C, G, A and T, and modified C (modC), hairpins are ligated to double-stranded DNA. The strands are separated, each is copied using Klenow (exo-) polymerase, and sequencing adaptors are ligated. ModCs are protected by TET enzyme oxidation and glycosylation. Unprotected Cs are then deaminated to U by APOBEC3A cytosine deaminase. Following deamination, the duplexes no longer pair well, so they behave as linear sequences, which are amenable to amplification by polymerase chain reaction. Indexes are then added and the chains subjected to paired-end sequencing. The two end-reads represent the same stretch of DNA, locally aligned, from which the 5-letter sequence may be resolved using the appropriate pairing rules.
To discriminate 5mC and 5hmC for 6-letter sequencing, an additional step is performed in this protocol: methylation at 5mC is enzymically copied across CpG units to the C on the copy strand using DNA methyltransferase 5, whereas 5hmC is glycosylated to prevent such copying. This produces distinct pairings of bases between the two ends from which all 6 letters may be resolved. The authors highlight application of the new methodology in a study of cell-free DNA from a patient with stage-III colon cancer. In applications of this kind, epigenetic markers may provide additional prognostic information. The methodology provides epigenomic information complementary to that from the Pacific Biosciences or Oxford Nanopore Technologies long-read methods by offering the greater sequencing depth and lower error rates that are advantages of short-read methods.
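The logic of resolving five letters from two paired reads can be illustrated with a toy model. The sketch below is a simplification inferred from the workflow as summarized above, not the authors' published decoding rules: the symbol `M` for modC, the function names, and the pairing table are illustrative assumptions, and read 2 is shown position-aligned to read 1 for clarity.

```python
# Toy model of five-letter decoding. The pairing table is an assumption
# inferred from the summarized workflow, not the authors' published rules.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def convert(original):
    """Simulate protection and deamination of an original strand and its copy.

    'M' stands for a modified cytosine (5mC or 5hmC). Returns the two reads,
    with read 2 (the copy strand) shown position-aligned to read 1.
    """
    read1, read2 = [], []
    for base in original:
        if base == "M":
            read1.append("C")      # protected modC still reads as C
            copy_base = "G"
        elif base == "C":
            read1.append("T")      # unprotected C is deaminated to U, read as T
            copy_base = "G"
        else:
            read1.append(base)
            copy_base = COMPLEMENT[base]
        # The copy strand carries only unmodified C, so every C in it is deaminated.
        read2.append("T" if copy_base == "C" else copy_base)
    return "".join(read1), "".join(read2)

# (read 1 base, read 2 base) -> resolved letter
PAIRING = {
    ("A", "T"): "A",
    ("T", "A"): "T",   # a genuine T, e.g. from a C-to-T mutation
    ("G", "T"): "G",
    ("T", "G"): "C",   # unmodified C, converted on the original strand only
    ("C", "G"): "M",   # modified C, protected from deamination
}

def resolve(read1, read2):
    """Decode paired reads; pairs outside the table are reported as 'N'."""
    return "".join(PAIRING.get(pair, "N") for pair in zip(read1, read2))

r1, r2 = convert("ACGTM")
print(r1, r2, resolve(r1, r2))
```

In this model a genuine T pairs with A on the copy strand, whereas a converted, unmodified C pairs with G, which is why a C-to-T mutation is not mistaken for an unmodified cytosine.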
Two papers describe the design and capabilities of a new instrument platform from Thermo Fisher Scientific (San Jose, CA). The platform incorporates a novel Asymmetric Track Lossless (Astral) mass analyzer in a hybrid system that combines it with a quadrupole mass filter and an Orbitrap mass analyzer. The Orbitrap performs full precursor scanning, while the Astral analyzer acquires MS/MS spectra. Stewart et al. describe the instrument design. The Astral analyzer works on a time-of-flight (ToF) principle: ion packets are resolved by m/z while traversing an oscillating and reversing track of >30 m. The short ion measurement period characteristic of ToF analyzers contributes to the acquisition of MS/MS spectra at a high rate (>200 Hz) while maintaining high resolution and high mass accuracy. The analyzer deploys a gridless design and spatial focusing for efficient ion transmission. A detector of novel design records ion arrivals from the Astral analyzer with high dynamic range. The duty cycles of the Orbitrap and Astral analyzers are synchronized to achieve fast scanning. These various features contribute to strong performance in LC-MS analysis that is particularly useful for proteomic applications. The instrument achieves high analytical depth (e.g. >7,600 proteins identified with a chromatographic run time of 4.8 min and sample throughput of 300/day), while maintaining good quantitative performance for purposes such as label-free quantification. Heil et al. investigate how the enhanced throughput benefits protein and peptide quantification. They comment that the Astral analyzer quantifies more peptides and proteins than the Orbitrap analyzer, and does so in one-fourth of the time. The analytical throughput can therefore be quadrupled using the Astral system without sacrificing quantitative performance. They also find that the speed of the Astral system permits the use of narrower precursor isolation windows. This generates less complex product ion spectra, which are easier to search.
Heil et al. deploy a previously published enrichment method to deplete extracellular vesicles from plasma without affecting lipoprotein particles. With this enrichment of plasma proteins, they are able to quantify 5,000 plasma proteins in a single 1 h LC-MS run.
GLYCANS
Black I M, Ndukwe I E, Vlach J, Backe J, Urbanowicz B R, Heiss C, Azadi P. Acetylation in ionic liquids dramatically increases yield in the glycosyl composition and linkage analysis of insoluble and acidic polysaccharides. Analytical Chemistry 95;2023:12851-12858.
Although polar in character, many polysaccharides are insoluble in water or dimethylsulfoxide (DMSO), two solvents commonly used for derivatization in the analysis of composition and structure. Examples of such insoluble polysaccharides are: constituents of plant and fungal cell walls and wood, mucilage from seed coats, and chitin from insect skeletons. Black et al. highlight the use of ionic liquids as solvents for the derivatization of such polysaccharides. Ionic liquids are substances composed predominantly of salts that are in the liquid state at ambient temperature. The authors demonstrate the use of the ionic liquid 1-ethyl-3-methylimidazolium acetate for acetylation of insoluble polysaccharides. They choose acetylation as a derivatization reaction to be performed as a preliminary to further processing because it increases the solubility of glycans in organic solvents such as methanol and DMSO, while the acetyl groups are conveniently removed again under acidic or basic conditions. The acetylation reaction is conducted in the ionic liquid at room temperature with acetic anhydride and 1-methylimidazole, and is complete within 10 min. The authors show that acetylation facilitates hydrolysis of previously insoluble polysaccharides for composition analysis, and protects polysaccharides from β-elimination reactions during permethylation prior to linkage analysis. The procedure is suitable for both soluble and insoluble polysaccharides.
This study builds upon the group’s earlier work demonstrating that protein complexes in a gas-phase ion beam generated in nanoelectrospray mass spectrometry can be recovered in a form suitable for subsequent electron microscope structural analysis by soft-landing on a transmission electron microscope grid coated with poly(propylene) glycol. Unfortunately, this soft-landing process is not directly compatible with cryo-electron microscopy (cryo-EM), which would provide structural information at higher resolution. The present paper describes methodology for cryogenic landing on a transmission electron microscope grid that is suitable for cryo-EM imaging. In a modified Orbitrap mass spectrometer, the grid is held at liquid nitrogen temperature, -190 °C, and water vapor is applied to form a thin film of amorphous ice either before or after deposition of the protein. The thickness of the film is calibrated using a quartz crystal microbalance. The instrumentation is configured to allow removal of the sample by isolation of the landing region from the mass spectrometer vacuum system. The resulting image quality is substantially better than that obtained with ambient temperature landing. Although structural resolution is still not as good as that achieved with conventional plunge-frozen preparation for cryo-EM, the system now allows systematic experimentation to investigate the potential sources of the limitation and so further improve image quality.
Differential interaction between membrane proteins and the diverse lipids that are present in membranes affects protein functional status by altering protein conformation, protein-protein interactions, and distribution of proteins between domains of different lipid composition within the plane of the membrane. Two groups here investigate the interactions between membrane proteins and membrane lipids by adapting methods originally developed with detergents or simple model lipids for use instead with diverse endogenous membrane lipids. However, the two groups employ very different approaches. Panda et al. use a native mass spectrometry approach in which liposomes of varying lipid composition, liposome size and protein-to-lipid ratio are created to emulate the properties of diverse biological membranes. Mass spectrometry is performed directly on these liposomes to determine the oligomeric state of the proteins and the identity of lipids they bind. In their present study, the authors determine how the transmembrane domain of a single integral membrane protein, VAMP2 (a protein also known as synaptobrevin or V-SNARE, that has a single transmembrane domain, and functions in driving vesicular fusion) binds to the diverse lipid constituents of the membranes it encounters on its biosynthetic journey from endoplasmic reticulum to Golgi to synaptic vesicle to plasma membrane. The authors show that this protein binds lipids with different selectivity in membranes of different composition mimicking different organellar environments. They then determine how the oligomeric state of a different integral protein, SemiSWEET (a sugar transporter of Gram-negative bacteria) is affected by lipid composition. They deduce that the presence of cardiolipin maintains the protein in a dimeric state, and exerts its effect not by its propensity to induce membrane curvature but by its negative charge.
Walters et al. use a protein NMR approach to investigate the interaction of peripheral membrane proteins with membrane lipids. Their methodology is to trap peripheral membrane proteins inside reverse micelles of native lipid composition. Reverse micelles are spherical, single-layered liposomes with polar head groups on the inside interfacing with a nano-pool of confined water and protein. The hydrophobic groups lie on the outside interfacing with an organic solvent such as pentane or hexane. The authors characterize the interactions of three peripheral membrane proteins with diverse lipid head-groups.
One of the challenges in analyzing membrane proteins by mass spectrometry is separating the proteins from the surfactants during translocation to the gas phase, a step that is required for both native mass spectrometry and sequence analysis. Collision-induced activation is frequently employed for this purpose, often as part of the process of collision-induced dissociation that cleaves peptide bonds for sequence analysis. Juliano et al. here investigate the use of infrared (IR) photoactivation as an alternative method for the dissociation of surfactants. They use as model systems micelles, bicelles, nanodiscs and liposomes formed between integral membrane proteins and a variety of detergents or lipids. They observe that IR activation is an efficient method for removal of surfactant prior to native mass spectrometric analysis or collisional dissociation. Energy input is readily tuned for different applications. Product ion spectra are simplified by removal of background from surfactants, and sequence coverage of some proteins is improved.
Two papers provide perspective on the perpetual problem of missing values in proteomic comparison studies, and what imputation can do to help. The problem is increasingly pressing in the setting of rising interest in analyses with extremely low sample input, e.g. single-cell proteomics. Imputation is the process of replacing missing data with substituted values derived from other available information. Imputation should be deployed with great caution for reasons that include the introduction of bias and false signals. Nonetheless, many single-cell proteomic procedures necessarily include imputation. Vanderaa & Gatto provide a Perspective on the advantages and disadvantages of imputation and alternative procedures. They identify specific challenges encountered in single-cell proteomics: the high proportion of missing values, the diversity of data types, the existence of cell-to-cell heterogeneity, strong batch effects, and differing causes of missing values. They provide recommendations for how to report the sensitivity and consistency of data from single cell studies, and how to describe the way missing values have been dealt with. Harris et al. perform a benchmarking study to evaluate commonly used imputation methods, specifically to determine how well alternative imputation methods contribute to identification of differentially expressed peptides, to what extent they increase the number of quantifiable peptides, and how well they improve the lower limits of peptide quantification. The authors conclude that imputation does not necessarily improve capability to identify differentially expressed peptides, but it can identify new peptides amenable to quantification, and improve lower limits of peptide quantification. However, they caution that existing imputation methods do not properly account for observed variance in peptide quantification, highlighting a need for further development of methods in this area.
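The caution about variance can be made concrete with a small numerical sketch. Mean imputation is used here purely for illustration and is not claimed to be one of the methods benchmarked by Harris et al.; the intensity values and the function name `impute_constant` are hypothetical.

```python
import statistics

def impute_constant(values, fill):
    """Replace missing values (None) with a single constant."""
    return [fill if v is None else v for v in values]

# Hypothetical log-intensities for one peptide across six samples; None = missing.
observed = [10.0, 12.0, None, 11.0, None, 13.0]

present = [v for v in observed if v is not None]
mean_fill = statistics.mean(present)
imputed = impute_constant(observed, mean_fill)

var_present = statistics.variance(present)   # sample variance of measured values only
var_imputed = statistics.variance(imputed)   # sample variance after mean imputation

print(f"variance before imputation: {var_present:.3f}")
print(f"variance after imputation:  {var_imputed:.3f}")
```

Filling gaps with a constant leaves the mean unchanged but shrinks the apparent variance, which can inflate test statistics in a downstream differential-expression analysis. This is one simple way in which imputation can introduce false signals.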
RNA interference (RNAi) has long been the standard method for targeted RNA knockdown in eukaryotes. However, this method may produce unintended cleavage of RNAs that have partial complementarity to the intended target transcript. RNAi is inefficient in targeting nuclear RNAs (because the Argonaute nucleases it uses are principally cytoplasmic), and it is incompatible with certain model eukaryotic systems, e.g. budding yeast and zebrafish embryos. Colognori et al. overcome these limitations with a system for specific RNA ablation that employs the clustered regularly interspaced short palindromic repeats (CRISPR)-Csm complex, which uses a programmable RNA-guided system. This complex is a Type III CRISPR system. As such, it targets RNA as well as DNA, and acts as a cyclic polyadenylate (cA) synthase. With 5 different subunits, the Csm nuclease is more complex than the more familiar, single-subunit Cas9 nuclease used for gene editing. But Csm is advantageous in not requiring proto-spacer adjacent motifs (PAM) for target selection, and not cleaving nearby RNAs in trans. Colognori et al. deliver the CRISPR-Csm system from Streptococcus thermophilus to human cells on a single vector, and show highly efficient (90-99%) knock-down of RNAs with minimal off-target effects. They also demonstrate nuclear as well as cytoplasmic targeting. There is no observable cytotoxicity such as that responsible for the failure of RNAi in zebrafish embryos. Ablation of DNase and cA synthase activities does not affect RNA ablating activities. These results indicate CRISPR-Csm to be a highly promising tool for RNA knock-down that is superior to methods presently in use. The authors are working to adapt it for possible lentiviral delivery.
Expansion microscopy is a procedure in which a cell or tissue sample to be imaged is expanded isotropically in order to increase image resolution. Chemically reactive groups in the sample (e.g. protein lysine side-chains) are covalently conjugated to a permeating, polyelectrolyte hydrogel that swells upon addition of water. Damstra et al. contribute to this methodology by developing a fluorescent reference grid that may be embedded in the hydrogel before expansion to serve as an expandable ruler for measuring expansion factors and correcting for local anisotropy that causes sample deformation. The grid is created by patterning a coverslip with a protein by means of a photolithographic process. The cover slip may serve as a substrate on which cultured cells are grown, or as a surface to which a tissue slice is applied during preparation for microscopy. The protein grid is revealed by fluorescent immunologic staining. The grid is two-dimensional only, so it does not provide direct information about expansion along the third axis. Nevertheless, the authors anticipate that this methodology will provide a convenient general procedure for quality control in expansion microscopy.
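The use of such a grid as an expandable ruler amounts to comparing grid spacings before and after expansion. The following is a minimal sketch of that arithmetic only; the function name, the spacing values, and the ratio used as an anisotropy measure are hypothetical and are not taken from Damstra et al.

```python
def expansion_factors(pre_spacing_um, post_x_um, post_y_um):
    """Return linear expansion factors along x and y, plus their ratio.

    pre_spacing_um: grid line spacing before expansion (same in x and y).
    post_x_um, post_y_um: measured spacings after expansion.
    A ratio of 1.0 indicates isotropic in-plane expansion.
    """
    factor_x = post_x_um / pre_spacing_um
    factor_y = post_y_um / pre_spacing_um
    anisotropy = max(factor_x, factor_y) / min(factor_x, factor_y)
    return factor_x, factor_y, anisotropy

# Hypothetical measurement: a 10 um grid that expands to 40 um by 42 um.
fx, fy, ratio = expansion_factors(10.0, 40.0, 42.0)
print(f"x: {fx:.2f}x, y: {fy:.2f}x, anisotropy ratio: {ratio:.3f}")
```

Applied cell by cell across the grid, the same calculation would give a local map of expansion that could be used to correct distances measured in the expanded sample.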
Pownall et al. employ expansion microscopy for high resolution visualization of chromatin and its interactions with RNA polymerase II, enhancers, promoters, and their protein ligands on a whole-nucleus scale. The methodology is used to infer how molecular interactions and chromatin architecture change during early development of zebrafish embryos. To visualize chromatin, labeling must be sufficient to overcome the effects of diminished molecular crowding and reduced brightness that result from the volume expansion. This is achieved by metabolic labeling and click chemistry: metabolic labeling with (2′S)-2′-deoxy-2′-fluoro-5-ethynyluridine, followed by fluorescent picolyl azide detection after expansion. The authors demonstrate preservation of chromatin structure under these conditions according to several criteria. They achieve a linear expansion factor of ∼15x for nuclei from embryos. With confocal microscopy, this provides a lateral resolution of ∼15 nm, sufficient for resolving chromatin fibers, but not nucleosomes (∼10 nm diameter). With stimulated emission depletion (STED), however, super-resolution of ∼3 nm is achieved. Using this methodology, Pownall et al. visualize transcription elongation as string-like structures decorated with multiple polymerase II molecules, and provide evidence that contacts between enhancers and promoters form transiently, and are released when transcriptional elongation begins. The methodology opens new avenues for investigation of transcriptional dynamics in vivo.
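The quoted resolutions follow from dividing the optical resolution of the microscope by the linear expansion factor. The short sketch below shows that arithmetic; the optical resolutions of 225 nm (confocal) and 45 nm (STED) are assumptions back-calculated from the figures in the summary, not values reported by the authors.

```python
def effective_resolution_nm(optical_resolution_nm, linear_expansion_factor):
    """Resolution referred to pre-expansion sample dimensions."""
    return optical_resolution_nm / linear_expansion_factor

EXPANSION = 15.0  # linear expansion factor quoted in the summary

# Assumed optical resolutions, chosen to be consistent with the summary.
for name, optical_nm in [("confocal", 225.0), ("STED", 45.0)]:
    effective = effective_resolution_nm(optical_nm, EXPANSION)
    print(f"{name}: {optical_nm:.0f} nm optical -> {effective:.0f} nm effective")
```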
For the purposes of de novo protein design, Watson et al. explore the use of a deep-learning methodology called denoising diffusion probabilistic modeling (DDPM), which has been notably successful in generating new, realistic images, but has hitherto met with limited success in designing new proteins. DDPMs are generative machine-learning models: viz. they learn to generate new data, rather than making predictions about, or classifying existing datasets. Diffusion models gradually add Gaussian noise to an existing dataset to slowly and systematically destroy its structure (the diffusion process), and then reverse (denoise) the data, employing a set of rules to rebuild structured datasets in new forms to fulfill defined criteria. The authors attribute the previous disappointing results to the limited ability of denoising networks to generate realistic protein backbones. But reasoning that new structure prediction methods such as RoseTTAFold (RF) incorporate excellent understanding of protein structure, they deploy RF as the denoising network in a protein design DDPM instead. Here, they fine-tune RF to yield excellent results in unconditional and topologically constrained protein monomer design; design of protein binders and symmetric oligomers; and scaffolding of enzyme active sites and symmetric motifs for metal-binding proteins and therapeutic proteins. They confirm the accuracy of the new methodology in a cryo-EM study of a protein designed to bind influenza hemagglutinin. This study demonstrates identity of the designed structure with the experimentally ascertained structure. The authors anticipate that the new RF diffusion methodology will enable design of varied functional proteins in accord with simple molecular specifications.
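The forward (noising) half of a diffusion model can be sketched in a few lines. The code below implements the generic DDPM forward process on a plain list of numbers, using the standard closed form x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε. It is not the scheme Watson et al. apply to protein backbones; the linear noise schedule, its parameters, and all function names are illustrative assumptions.

```python
import math
import random

def linear_beta_schedule(steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, increasing linearly (an assumed schedule)."""
    return [beta_start + (beta_end - beta_start) * t / (steps - 1)
            for t in range(steps)]

def cumulative_alpha(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, product = [], 1.0
    for beta in betas:
        product *= 1.0 - beta
        alpha_bars.append(product)
    return alpha_bars

def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t directly from x_0 using the closed-form forward process."""
    signal = math.sqrt(alpha_bars[t])
    noise = math.sqrt(1.0 - alpha_bars[t])
    return [signal * x + noise * rng.gauss(0.0, 1.0) for x in x0]

rng = random.Random(0)
alpha_bars = cumulative_alpha(linear_beta_schedule(1000))
x0 = [1.0, -2.0, 0.5, 3.0]   # stand-in for structured data such as coordinates

for t in (0, 499, 999):
    print(t, round(alpha_bars[t], 4), q_sample(x0, t, alpha_bars, rng))
```

As t grows, the retained signal fraction falls toward zero and the sample approaches pure Gaussian noise. A denoising network, the role played by fine-tuned RoseTTAFold in the work summarized above, is trained to reverse these steps.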
A collection of recent papers documents a milestone in progress toward creating a eukaryote, Saccharomyces cerevisiae, with a fully synthetic genome, a task undertaken by the Synthetic Yeast Genome Project (Sc2.0) consortium. The project is expected to yield fundamental insights into the way genomes function in cells, new methodology (both experimental and computational) for manipulation of genomes, and new platforms for biomedical and industrial production. Arguably, the enterprise itself will additionally stimulate inter-laboratory collaboration, inter-professional collaboration, and beneficial communication between scientists and the general public. The genome of the new yeast strain comprises >50% synthetic DNA. Of the 16 chromosomes in wild-type yeast, 6.5 are edited and synthesized in the laboratory. An additional chromosome that collects together genes distributed among several wild-type chromosomes brings the total number of synthetic chromosomes to 7.5. Foo et al. describe the evolution of methodology for designing a neo-chromosome with its recoded genes, inserted features and modified elements; building the redesigned chromosome; testing it; and incorporating the lessons learned into subsequent development cycles. Zhao et al. show how assembly of the individual neo-chromosomes into a single strain is performed stepwise by successive crosses with strains containing one neo-chromosome each, followed at each step by localization and editing to solve problems created by unanticipated epistatic effects. Lauer et al. focus on chromosome VIII. Of particular interest are their observations of the effects on chromosome stability of centromere location and sequence context. These observations provide new knowledge about centromere function. Schindler et al. describe the design, construction and characterization of the entirely new chromosome in Sc2.0. This encodes all 275 nuclear tRNAs of yeast.
tDNAs and their associated retrotransposons and repetitive sequences are known to act as foci of genomic instability. Their relocation and editing is intended to address this instability issue. The consortium is now working toward integration of the remaining synthetic chromosomes, a process whose completion is anticipated to be imminent.
The replacement of a carbon atom by a nitrogen atom in an aromatic ring system is a transformation of key interest in drug development because such replacements produce changes in polarity that affect drug interactions with proteins and can be exploited to modulate target specificity, metabolic stability and solubility. Heterocyclic compounds containing nitrogen atoms are accordingly widespread amongst approved medicines. Unfortunately, such skeletal editing commonly requires arduous, parallel resynthesis, and becomes increasingly difficult as the number of nitrogen atoms in the skeleton increases. Consequently, there is intense interest in the search for general strategies for streamlined C-to-N transmutation. Woo et al. describe such a strategy for the conversion of quinolines to quinazolines. In a ‘one-pot’ process, they perform both nitrogen insertion and carbon deletion. This is achieved by initial N-oxidation, followed by rearrangement of the oxidized species under mild and selective irradiation from an ultraviolet light-emitting diode, forming a product that, upon ozonolysis, undergoes ring opening with formation of two reactive carbonyl groups. In the presence of ammonia as a nitrogen source, these carbonyl groups react to reform a ring containing a new nitrogen atom. This strategy is broadly applicable to different quinolines and related azaarenes. The work encourages optimism that even more general skeletal editing strategies might be formulated in the future.