Article Watch: September, 2021

Clive Arthur Slaughter

doi:10.7171/jbt.21-3203-018

Abstract

This column highlights recently published articles that are of interest to the readership of this publication. We encourage ABRF members to forward information on articles they feel are important and useful to Clive Slaughter, MCG-UGA Medical Partnership, 1425 Prince Avenue, Athens GA 30606; Tel: (706) 713-2216; Fax: (706) 713-2221; Email: [email protected]; or to any member of the editorial board. Article summaries reflect the reviewer’s opinions and not necessarily those of the Association.

NUCLEIC ACID SEQUENCING

Xin R, Gao Y, Gao Y, Wang R, Kadash-Edmondson K E, Liu B, Wang Y, Lin L, Xing Y. isoCirc catalogs full-length circular RNA isoforms in human transcriptomes. Nature Communications 12;2021:266.

Zhang J, Hou L, Zuo Z, Ji P, Zhang X, Xue Y, Zhao F. Comprehensive profiling of circular RNAs with nanopore sequencing and CIRI-long. Nature Biotechnology 39;2021:836-845.

Circular RNAs (circRNAs) are formed during mRNA splicing. They are produced when a 5’ donor splice site is joined not to a downstream 3’ acceptor splice site, as in canonical splicing, but instead to an upstream 3’ splice site (back-splicing). This forms a circle which is closed by a junction between exons in reverse order. More than 183,000 circRNAs have already been identified from human transcriptomes. They are expressed at low levels, but sometimes in a tissue-specific manner. They modulate transcription and splicing, bind miRNAs and proteins, and may be translated into polypeptides. They are involved in cell proliferation and transformation, neuronal and innate immune function, and in pathogenesis when mis-regulated. Progress in research on circRNAs has been limited by the difficulty of determining their full-length sequences using short-read sequencing alone, but two groups now provide optimized strategies for full-length sequencing of circRNAs using nanopore long-read sequencing. Both groups prepare circRNA by depleting ribosomal RNA and degrading linear RNA by digestion with RNase R. They amplify the remaining RNA by rolling-circle reverse transcription with random primers. Nanopore sequencing of the resulting linearized libraries yields multiple copies of each template sequence initiated from random positions within the circRNA during the reverse transcriptase reaction. The products may include tandem repeats. From these raw reads consensus sequences are generated and the positions of back-splice and canonical splice junctions are deduced. Using such a protocol, Xin et al. produce a catalog of over 107,000 full-length circRNA sequences derived from 12 human tissues and 1 human cell line. They document alternative splicing events within circRNAs, including retained introns. Zhang et al. analyze circRNAs from mouse brain, including species from the mitochondrial genome. These studies provide a template for future research to document circRNAs.
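
The idea of deriving a consensus from the tandem repeats in a rolling-circle read can be illustrated with a minimal sketch. It is not the algorithm of isoCirc or CIRI-long: it assumes the repeat length is already known, that the read starts on a copy boundary, and that errors are substitutions only, whereas real nanopore reads also contain insertions and deletions and require alignment. The function names and the example sequence are hypothetical.

```python
from collections import Counter


def split_into_copies(read: str, unit_length: int) -> list[str]:
    """Cut a tandem-repeat read into consecutive full-length copies.

    Assumption: the repeat unit length is known and the read starts
    at a copy boundary (real tools must infer both).
    """
    return [read[i:i + unit_length]
            for i in range(0, len(read) - unit_length + 1, unit_length)]


def consensus(copies: list[str]) -> str:
    """Take a majority vote at each position across equal-length copies."""
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*copies))


# Hypothetical 8-nt template read three times; the second copy
# carries one substitution error (G -> T at index 3).
read = "ACGGATTC" + "ACGTATTC" + "ACGGATTC"
copies = split_into_copies(read, unit_length=8)
print(consensus(copies))  # prints ACGGATTC
```

With three or more copies, an error present in a single copy is outvoted, which is why multiple passes around the circle improve the accuracy of the final sequence.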

Abascal F, Harvey L M R, Mitchell E, Lawson A R J, Lensing S V, Ellis P, Russell A J C, Alcantara R E, Baez-Ortega A, Wang Y, Kwa E J, Lee-Six H, Cagan A, Coorens T H H, Chapman M S, Olafsson S, Leonard S, Jones D, Machado H E, Davies M, Øbro N F, Mahubani K T, Allinson K, Gerstung M, Saeb-Parsy K, Kent D G, Laurenti E, Stratton M R, Rahbari R, Campbell P J, Osborne R J, Martincorena I. Somatic mutation landscapes at single-molecule resolution. Nature 593;2021:405-410.

Reliable detection of mutations in single somatic cells requires a very high standard of sequencing accuracy, which has proven difficult to attain. Nonetheless, study of the rate and pattern of somatic mutation is of particular interest for understanding processes such as human aging and neurodegenerative and cardiovascular disease. Sequencing accuracy has been increased by barcoding individual DNA molecules and sequencing each molecule many times to produce a consensus sequence, and has been yet further improved by duplex consensus sequencing. In the latter process, errors are corrected by sequencing copies of both strands. While investigating the accuracy of one such duplex sequencing protocol, BotSeqS, Abascal et al. noticed a large excess of consensus calls supported by only one of the two strands, with almost complete asymmetry between the frequency of these substitutions and that of the complementary substitutions on the second strand. Such changes cannot be caused by real mutations: they must reflect DNA damage either in vivo or during library preparation. Their preponderance near the 5’ ends of DNA fragments suggests they originate from technical artefacts introduced during library preparation, probably resulting from end repair. The authors accordingly develop a new protocol called NanoSeq that eliminates such artefacts by avoiding end repair and by blocking nick extension. Instead of sonication and end repair they fragment with restriction enzymes, or use sonication with exonuclease blunting. They also perform A-tailing by replacing dATP with a mixture of dATP and the dideoxynucleotides ddCTP, ddGTP and ddTTP. Any internal nicks trigger DNA polymerase extension until a dideoxynucleotide is incorporated, whereupon the affected DNA strand is rendered unamplifiable. These modifications reduce error rates to less than 5 per billion base pairs, 2 orders of magnitude lower than typical somatic mutation loads.
The authors employ their improved methodology to compare somatic mutation rates between stem cells and non-dividing cells in tissues, and unexpectedly find the mutation loads to be very similar. They also show that mutations accumulate in non-dividing neurons and polyclonal smooth muscle at a constant rate throughout life. These results indicate that the mutational process is independent of cell division.
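
The logic of duplex consensus calling described above can be sketched as follows. This is a simplified illustration, not the NanoSeq pipeline: it assumes that a consensus has already been built for each strand and that both consensuses are expressed in the orientation of the reference. The function names and sequences are hypothetical.

```python
def duplex_calls(ref: str, top: str, bottom: str) -> list[tuple[int, str, str]]:
    """Return (position, ref_base, alt_base) where BOTH strand
    consensuses agree on the same non-reference base."""
    return [(pos, r, t)
            for pos, (r, t, b) in enumerate(zip(ref, top, bottom))
            if t == b and t != r]


def single_strand_changes(ref: str, top: str, bottom: str) -> list[int]:
    """Return positions where exactly one strand differs from the
    reference; these are candidates for damage or artefacts."""
    return [pos
            for pos, (r, t, b) in enumerate(zip(ref, top, bottom))
            if (t != r) != (b != r)]


# Hypothetical fragment: position 4 is changed on both strands,
# position 7 on the bottom strand only.
ref = "ACGTACGT"
top = "ACGTTCGT"
bottom = "ACGTTCGA"
print(duplex_calls(ref, top, bottom))           # prints [(4, 'A', 'T')]
print(single_strand_changes(ref, top, bottom))  # prints [7]
```

Only the change seen on both strands is reported as a mutation. The single-strand change is the kind of call whose excess alerted the authors to library preparation artefacts.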

Datlinger P, Rendeiro A F, Boenke T, Senekowitsch M, Krausgruber T, Barreca D, Bock C. Ultra-high-throughput single-cell RNA sequencing and perturbation screening with combinatorial fluidic indexing. Nature Methods 18;2021:635-642.

Datlinger et al. explore scale-up of single-cell RNA sequencing (scRNA-seq) by combinatorial indexing coupled with the use of droplet generators. Combinatorial indexing in microwell plates is a well-established technique. The authors begin with a first round of indexing with permeabilized bulk aliquots of cells or nuclei in a 384-well plate. In each well, the transcripts receive a well-specific barcode, a unique molecular identifier, and a primer binding site during reverse transcription. The pre-indexed, permeabilized cells are then pooled and loaded into a standard microfluidic droplet generator (10X Genomics Chromium) at a density that permits several cells or nuclei per droplet. Inside the droplets, the cells/nuclei are lysed and oligonucleotides carrying microfluidic barcodes are delivered via barcode-specific beads for ligation. The droplet emulsion is then broken for sequencing library preparation in bulk. The combination of two barcodes uniquely identifies transcripts derived from the same single cell. The name of the methodology, ‘single-cell combinatorial fluidic indexing,’ provides the rationale for the acronym scifi-RNA-seq. Allowing many cells per droplet provides a convenient way to multiplex thousands of samples in a single experiment: the authors demonstrate up to 150,000 single-cell transcriptomes per channel in the Chromium system (more than 1 million single-cell transcriptomes per chip with eight channels). They anticipate that the capacity for scale-up will be useful in cell atlas projects, single-cell transcriptomics in large-scale cohorts of patients, clustered regularly interspaced short palindromic repeat (CRISPR) screens, and drug screens with single-cell read-out.
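
How the pair of barcodes resolves cells that share a droplet can be sketched in a few lines. This is an illustration only, not the authors' software: the barcode labels, gene names, and the assumption that cells are spread uniformly over the 384 wells are hypothetical.

```python
from collections import defaultdict


def group_by_cell(reads):
    """Group transcripts by the (well barcode, droplet barcode) pair."""
    cells = defaultdict(list)
    for well_bc, droplet_bc, transcript in reads:
        cells[(well_bc, droplet_bc)].append(transcript)
    return dict(cells)


def p_all_distinct(cells_in_droplet: int, wells: int = 384) -> float:
    """Probability that all cells in one droplet carry different well
    barcodes, assuming cells were spread uniformly over the wells."""
    p = 1.0
    for i in range(cells_in_droplet):
        p *= (wells - i) / wells
    return p


# Hypothetical reads: droplet D17 holds two cells from different wells.
reads = [
    ("W001", "D17", "GeneA"),
    ("W002", "D17", "GeneB"),
    ("W001", "D17", "GeneC"),
    ("W001", "D42", "GeneA"),
]
cells = group_by_cell(reads)
print(len(cells))                    # prints 3
print(round(p_all_distinct(5), 3))   # chance that 5 co-encapsulated cells are resolvable
```

The droplet barcode alone would merge the two cells in D17. Adding the well barcode separates them, which is what permits the droplet generator to be loaded at a density of several cells per droplet.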

GLYCANS

Cui Y, Tabang D N, Zhang Z, Ma M, Alpert A J, Li L. Counterion optimization dramatically improves selectivity for phosphopeptides and glycopeptides in electrostatic repulsion-hydrophilic interaction chromatography. Analytical Chemistry 93;2021:7908-7916.

Cui et al. use ion pairing to improve the selectivity of hydrophilic interaction chromatography (HILIC) for enrichment of glycopeptides and phosphopeptides from protein digests. Such normal-phase separations are generally conducted with an anion exchange column at acidic pH to suppress the charge on most acidic side-chains. The trifluoroacetate anion has long been employed to ion-pair with basic residues in peptides to de-emphasize their contribution to retention in HILIC. Cui et al. also make use of the well-hydrated Mg²⁺ cation to bolster the retention of peptides which have functional groups that remain negatively charged at the chosen low pH, notably phosphate groups (pK_a ~2.1), sialyl groups (pK_a ~2.6), and isoaspartyl groups (pK_a ~3.1) resulting from asparagine deamidation. Using magnesium trifluoroacetate as a mobile phase additive during gradient elution, the authors produce fractions enriched in peptides with phosphate, mannose-6-phosphate, and N- and O-glycans. This methodology provides an alternative to conventional affinity enrichment methods involving titanium dioxide, immobilized metal affinity chromatography (IMAC), lectins, or boronic acids. The co-enrichment of glycopeptides and phosphopeptides will be useful in some applications.
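
The extent to which each of these groups stays charged at low pH follows from the Henderson-Hasselbalch relation, fraction deprotonated = 1 / (1 + 10^(pK_a - pH)). The short sketch below applies it to the pK_a values quoted above. The mobile phase pH of 2.0 is an assumption chosen for illustration and is not taken from the paper.

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch: fraction of an acidic group carrying
    its negative charge at the given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Approximate pK_a values quoted in the summary above.
PKA = {"phosphate": 2.1, "sialyl": 2.6, "isoaspartyl": 3.1}

ASSUMED_PH = 2.0  # hypothetical mobile phase pH, for illustration only

for group, pka in PKA.items():
    print(f"{group}: {fraction_deprotonated(pka, ASSUMED_PH):.2f}")
```

Under this assumed pH, a substantial fraction of phosphate groups remains ionized and the fraction falls as pK_a rises, consistent with the ordering of the groups listed in the summary.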

Persson A, Nikpour M, Vorontsov E, Nilsson J, Larson G. Domain mapping of chondroitin/dermatan sulfate glycosaminoglycans enables structural characterization of proteoglycans. Molecular & Cellular Proteomics 20;2021:100074.

Structural analysis of glycosaminoglycans (GAGs) remains laborious. Persson et al. illustrate the use of a new workflow for the characterization of the chondroitin/dermatan sulfate (CS/DS) class of GAGs. These molecules have 25-100 repeating disaccharide units consisting of N-acetylgalactosamine (GalNAc) with either glucuronic acid (GlcA) or iduronic acid (IdoA) in characteristic linkages. These residues may undergo O-sulfation at characteristic positions on the monosaccharides. CS/DS GAGs comprise three domains: the non-reducing end furthest away from the protein core, an internal oligosaccharide domain, and a tetrasaccharide linkage region. The linkage region consists of a GlcA residue, 2 galactose (Gal) residues that may be sulfated or sialylated, and a xylose (Xyl) residue that may be phosphorylated. The xylose attaches to a serine residue by O-linkage. The authors’ methodology is conceived as mapping of these three domains. They tackle analysis of chromogranin-A (CgA) secreted by a rat insulinoma cell line, the main GAG product of these cells. Their methodology for GAG purification features benzonase digestion to remove contaminating oligonucleotides and hyaluronidase digestion to remove hyaluronic acid. These substances would otherwise interfere with mass spectral data interpretation and with the activity of reagent enzymes for depolymerization of the GAGs. The authors also use pronase to hydrolyse the protein moiety rather than removing it by chemically harsh β-elimination. They employ C4 reverse-phase chromatography for GAG purification to enable resolution of long chains. Depolymerization by bacterial lyases yields products from the different domains. GAGs are very heterogeneous, yet they have key functions, e.g. in cell adhesion and cell signaling. Along with detailed procedures for analysis of the individual domains, this methodology contributes to the study of this important class of cellular products.

Pepi L E, Leach F E, Klein D R, Brodbelt J S, Amster I J. Investigation of the experimental parameters of ultraviolet photodissociation for the structural characterization of chondroitin sulfate glycosaminoglycan isomers. Journal of the American Society for Mass Spectrometry 32;2021:1759-1770.

Noting that the distinction between different classes of GAGs can be quite subtle and difficult to discern by mass spectrometry, yet may be biologically meaningful, Pepi et al. explore ways to improve existing fragmentation techniques to distinguish them. Following up on the group’s earlier studies of ultraviolet photodissociation (UVPD) (Klein D et al. Analytical Chemistry 91;2019:6019-6026), they seek methods to distinguish chondroitin sulfate from dermatan sulfate, which differ only in their stereochemistry at C-5 of their uronic acids. Specifically, they investigate experimental parameters employed for UVPD in negative ion mode: laser wavelength (193 or 213 nm) and number of laser pulses, the use of the low-pressure versus the high-pressure trap, and the precursor ionization state. Interestingly, they find that precursor charge state has the biggest effect. A precursor with one more deprotonated site than the number of sulfate modifications reduces sulfate decomposition while maintaining informative ring fragmentation. All investigators using UVPD will find the optimization of the technique in this instance to be informative.

METABOLOMICS

Alseekh S, Aharoni A, Brotman Y, Contrepois K, D’Auria J, Ewald J, Ewald J C, Fraser P D, Giavalisco P, Hall R D, Heinemann M, Link H, Luo J, Neumann S, Nielsen J, Perez De Souza L, Saito K, Sauer U, Schroeder F C, Schuster S, Siuzdak G, Skirycz A, Sumner L W, Snyder M P, Tang H, Tohge T, Wang Y, Wen W, Wu S, Xu G, Zamboni N, Fernie A R. Mass spectrometry-based metabolomics: a guide for annotation, quantification and best reporting practices. Nature Methods 18;2021:747-756.

Metabolites are tremendously diverse in both chemical structure and abundance, and their number is undefined. For these reasons, we cannot seek comprehensive coverage in metabolomics. Instead, metabolomic methodology is directed toward particular aims such as targeted metabolite analysis, metabolite profiling, or flux analysis. In such settings, the establishment of guidelines for acquisition and reporting of metabolite data is challenging. The authors of the present effort acknowledge prior attempts such as the Metabolomics Standards Initiative but recognize that few studies follow the required standards in their entirety, so data are submitted to the metabolome databases less often than one would like. Although compliance with reporting and archiving standards is useful for the community, the present authors hold that evaluation of data can be simplified and compliance thereby improved. They emphasize the central importance of two requirements: analysis of the quality of metabolite annotation, and assessment of the quantitative recovery of analyte peaks. The authors provide reporting guidelines for processed data rather than raw chromatograms, albeit suitably supported by the provision of representative chromatograms to document the quality of metabolite identification. They also insist on the need for quantification control experiments to assess the problematic effects of ion suppression. Additionally, they supply guidelines for sampling, extraction and storage, and suggest a stricter nomenclature for metabolite annotation. It is hoped that adoption of these new recommendations will improve the quality and cross-laboratory comparability of metabolomic datasets and thereby stimulate confidence in the conclusions drawn.

MASS SPECTROMETRY

Schneider B B, Javaheri H, Bedford L, Covey T R. Sampling efficiency improvement to an electrospray ionization mass spectrometer and its implications for liquid chromatography based inlet systems in the nanoliter to milliliter per minute flow range. Journal of the American Society for Mass Spectrometry 32;2021:1441-1447.

Javaheri H, Schneider B B. Ion guide for improved atmosphere to mass spectrometer vacuum ion transfer. Journal of the American Society for Mass Spectrometry 2021: online ahead of print.

During combined liquid chromatography-mass spectrometry (LC-MS) with electrospray ionization (ESI), the ionization efficiency is generally high, but the sampling efficiency (the efficiency with which ions are transported to the high-vacuum aperture for entry to the mass analyzer) depends strongly on the liquid flow rate from the LC. Sampling efficiency in the nanoflow regime (~300 nL/min) approaches 100%, whereas that at milliflow rates (~100-500 µL/min) may be as low as 1%. The losses occur through scattering of charged droplets from the electrospray emitter. Scattering results from electrostatic repulsion between droplets within the ion source and from gas flow streams that divert droplets from the vacuum orifice. Schneider et al. describe optimization of a prototype interface that ameliorates these effects. The interface employs an enlarged inlet orifice, increased gas flow, and control of ion trajectories by a multipole RF ion guide. The ion guide design is described by Javaheri and Schneider. With this assembly, the sampling efficiency difference between nanoflow and microflow narrows to 3x and that between nanoflow and milliflow to 13x. The capability to use higher LC flow rates conferred by the system represents a significant gain in convenience of LC separation.

PROTEOMICS

Messner C B, Demichev V, Bloomfield N, Yu J S L, White M, Kreidl M, Egger A-S, Freiwald A, Ivosev G, Wasim F, Zelezniak A, Jürgens L, Suttorp N, Sander L E, Kurth F, Lilley K S, Mülleder M, Tate S, Ralser M. Ultra-fast proteomics with Scanning SWATH. Nature Biotechnology 39;2021:846-854.

In epidemiologic or quantitative proteomic studies, the capability to analyze sufficient numbers of individuals or samples to gain the statistical power for making valid biological conclusions often depends upon high sample throughput. Messner et al. demonstrate a platform capable of sustaining very high sample throughput while maintaining or surpassing available proteome analysis depth. Their methodology is based upon data-independent acquisition of MS/MS spectra by Sequential Window Acquisition of all Theoretical fragment ion spectra (SWATH). In this modality, broad m/z windows of precursor ions are successively collected for fragmentation during LC-MS/MS on a cyclic basis, and the contributions of individual precursors to the resulting assemblages of fragments are computationally resolved using their differential change in intensities over chromatographic time. Peptides are identified by database matching in the absence of an accurate measurement of precursor mass. Messner et al. here adapt a version of SWATH analysis originally published by Moseley et al. (Journal of Proteome Research 17;2018:770-779) in which precursors are not collected in a succession of discrete (although partially overlapping) m/z windows, but by sliding the m/z window continuously, a process known as scanning SWATH. As the window slides past a given precursor, the fragments derived from it appear, increase in strength, decrease, and finally disappear again. This modality extends the capability of conventional SWATH by allowing the fragments from co-eluting peptides of different mass within the window to be distinguished, and enables a precursor mass for each fragment to be assigned. Messner et al. perform this analysis on a quadrupole-time-of-flight (Q-TOF) mass spectrometer. The quadrupole generates the sliding 10-m/z (10 Th) window. The fragments are continually read out in a fast-scanning TOF sector. The TOF data are written into bins that correspond to defined precursor m/z ranges.
This is done by summing all TOF pulses that overlap the defined range of the bin as generated by the quadrupole. The authors perform chromatography at high flow rate (800 µL/min) with rapid gradients (5 min duration), allowing a sustainable throughput of 180 proteome injections/day on a single instrument. In a comparison experiment, scanning SWATH identified 4,394 unique proteins, conventional stepped SWATH identified 3,568, and data-independent FAIMS identified 3,594, all with comparable gradient length. The methodology therefore benefits both throughput and proteome coverage. It can also be used to improve coverage with longer gradients.
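
The binning step, in which every TOF pulse whose quadrupole window overlaps a precursor bin is summed into that bin, can be illustrated with a small sketch. It is a simplified model rather than the authors' acquisition software: the 10 Th window width comes from the text, whereas the 5 Th bin width, the 2 Th step between pulses, and the precursor at m/z 507 are hypothetical.

```python
WINDOW_WIDTH = 10  # Th, width of the sliding quadrupole window (from the text)


def overlaps(window_start, bin_start, bin_end, width=WINDOW_WIDTH):
    """True if the window [window_start, window_start + width)
    overlaps the precursor bin [bin_start, bin_end)."""
    return window_start < bin_end and window_start + width > bin_start


def bin_pulses(pulses, bin_edges):
    """Sum the intensity of one fragment over all TOF pulses whose
    quadrupole window overlaps each precursor bin.

    pulses: list of (window_start_mz, fragment_intensity).
    """
    sums = [0.0] * (len(bin_edges) - 1)
    for window_start, intensity in pulses:
        for i in range(len(sums)):
            if overlaps(window_start, bin_edges[i], bin_edges[i + 1]):
                sums[i] += intensity
    return sums


# Hypothetical precursor at m/z 507: its fragment is seen only while
# the sliding window contains 507. One pulse is recorded every 2 Th.
PRECURSOR_MZ = 507
pulses = [(start, 1.0 if start <= PRECURSOR_MZ < start + WINDOW_WIDTH else 0.0)
          for start in range(480, 521, 2)]

bin_edges = [490, 495, 500, 505, 510, 515, 520]  # hypothetical 5 Th bins
print(bin_pulses(pulses, bin_edges))  # prints [0.0, 1.0, 4.0, 5.0, 3.0, 1.0]
```

In this toy example the summed fragment signal peaks in the 505-510 bin, which contains the precursor. That is the sense in which scanning SWATH allows a precursor mass to be assigned to each fragment.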

FUNCTIONAL GENOMICS/PROTEOMICS

Prensner J R, Enache O M, Luria V, Krug K, Clauser K R, Dempster J M, Karger A, Wang L, Stumbraite K, Wang V M, Botta G, Lyons N J, Goodale A, Kalani Z, Fritchman B, Brown A, Alan D, Green T, Yang X, Jaffe J D, Roth J A, Piccioni F, Kirschner M W, Ji Z, Root D E, Golub T R. Noncanonical open reading frames encode functional proteins essential for cancer cell survival. Nature Biotechnology 39;2021:697-704.

The human genome contains ~17,600 protein-coding genes that have been confirmed by mass spectrometric identification of their polypeptide products, and a further ~2,100 unconfirmed protein-coding genes. However, ribosome profiling indicates that a large number of additional, non-canonical open reading frames (ORFs) encode RNAs that are translated into polypeptides; this despite their present annotation as noncoding RNAs or pseudogenes or their location within the 5’ or 3’ untranslated regions (UTRs) of canonical protein-coding genes. The present study expands the number of such ORFs and describes methodology for characterizing them. The authors study a list of 553 likely candidates. They search publicly available mass spectrometry datasets for peptides derived from the ORFs, and test translatability from a cDNA expression library of all 553 sequences by immunochemical methods. These studies provide evidence for stable expression of 334 of the ORFs. The authors test the biological effect of all 553 ORFs by transducing each into 4 different cell lines by lentiviral vectors and screening the effects on transcription of ~1000 genes using the Broad Institute L1000 mRNA profiling assay. In 48 cases, transcription was significantly perturbed, and the perturbation was abrogated by mutation of the translational start site, indicating that protein expression was responsible. The authors conduct loss-of-function viability screens targeting the 553 ORFs in cancer cell lines by CRISPR/Cas9 knockout. Knockout of 57 of the ORFs inhibited cell growth. Detailed tiling studies are conducted to localize the growth inhibitory effect in selected cases. Thirteen ORFs scored highly in all three of these high-throughput assays (translation effect, bioactivity and CRISPR vulnerability). The assemblage of methods employed in this study is expected to provide a useful template for future work on non-canonical ORFs.

Jiao C, Sharma S, Dugar G, Peeck N L, Bischler T, Wimmer F, Yu Y, Barquist L, Schoen C, Kurzai O, Sharma C M, Beisel C L. Noncanonical crRNAs derived from host transcripts enable multiplexable RNA detection by Cas9. Science 372;2021:941-948.

Clinical tests that exploit the specificity with which DNA or RNA can be cleaved by CRISPR-based systems have proved eminently practicable. The results encourage the hope that highly sensitive point-of-care tests that require no instrumentation or fluidic handling may yet be developed. Jiao et al. here describe an implementation of CRISPR-Cas9 amenable to extensive scale-up for the performance of multiple clinical biomarker tests in a single reaction. The authors were investigating the CRISPR system of Campylobacter jejuni, a type II system. Such systems use a trans-activating CRISPR RNA (tracrRNA) that hybridizes to the ‘repeat’ portion of each crRNA in the host cell’s endogenous CRISPR system. The crRNA is then processed for use as a guide RNA, which hybridizes to an invading nucleic acid for cleavage by a Cas9 enzyme. While sequencing the crRNAs in C. jejuni, the authors made an unexpected and remarkable discovery. They found, bound to the Cas complex, endogenous host cell RNA transcripts in addition to RNAs from the CRISPR system itself. The physiological significance of this for the bacterium is unknown, but the authors anticipated that they would be able to make use of the ability of tracrRNA to bind to semi-complementary RNAs from various sources. They reprogrammed tracrRNAs to create functional guide RNAs from such non-canonical RNAs. These guide RNAs are then allowed to bind to and cleave fluorescent DNA sensors as a means to detect the presence of cognate RNA sequences. This methodology provides a non-collateral means of identifying multiple transcripts. The authors name this platform LEOPARD (leveraging engineered tracrRNAs and on-target DNAs for parallel RNA detection). They use gel electrophoresis or a bioanalyzer to specifically detect 9 RNA fragments from respiratory viruses: 2 from SARS-CoV-2, 6 from other coronaviruses, and 1 from influenza H1N1.
They also distinguished the Asp614Gly variant of SARS-CoV-2 to demonstrate specificity for a single-base change. The authors anticipate that incorporation of microarrays or high-throughput sequencing will allow up to millions of targets to be detected simultaneously by linking the presence of particular RNAs to the binding of labeled Cas9 or the cleavage of DNA targets at specific locations on a microarray. They envision that future developments of LEOPARD will provide multiplexed diagnostic tools for detection of viral variants, screening for cancer mutations, identifying pathogens or antibiotic resistance markers, and recognizing drug susceptibility profiles in gene expression analyses.

Nasser J, Bergman D T, Fulco C P, Guckelberger P, Doughty B R, Patwardhan T A, Jones T R, Nguyen T H, Ulirsch J C, Lekschas F, Mualim K, Natri H M, Weeks E M, Munson G, Kane M, Kang H Y, Cui A, Ray J P, Eisenhaure T M, Collins R L, Dey K, Pfister H, Price A L, Epstein C B, Kundaje A, Xavier R J, Daly M J, Huang H, Finucane H K, Hacohen N, Lander E S, Engreitz J M. Genome-wide enhancer maps link risk variants to disease genes. Nature 593;2021:238-243.

This paper updates work on the identification of enhancer elements (Fulco CP et al. Nature Genetics 51;2019:1664-1669) previously described by the same group. In their earlier study, the authors developed a high-throughput experimental approach for enhancer identification in which hundreds of putative enhancers for a gene of choice are perturbed by CRISPR interference (CRISPRi). The effects on expression of the RNA of interest are then detected in single cells by RNA fluorescence in situ hybridization (FISH) as measured by fluorescence-activated cell sorting (FACS). A library of guide RNAs (gRNAs) is transduced into the cells for repression of the selected target sequences by an inducible system for Cas9 expression. The authors tested >3,500 putative enhancers for a total of 30 genes. They showed that enhancers often regulate more than one gene, found that most enhancers with detectable effects lie within 100 kb of their target promoters, and determined that enhancers can have quantitatively widely ranging effects, including many elements with small effects. The study was significant in measuring effects on endogenous genes as the reporters rather than relying on surrogate positional or conformational/contact data, or relying on expression of exogenous fluorescent proteins. The authors went on to propose a model for prediction of enhancer connections called activity-by-contact (ABC). Predictions are based on measured enhancer activity weighted according to the frequency of 3-D contact with the target gene. Nasser et al. now expand their study by compiling enhancer-gene maps for 131 human cell types and tissues. They use their maps for interpretation of genome-wide association study (GWAS) results, identifying the target gene for 5,036 GWAS signals. The database and methodology are expected to facilitate future studies seeking to connect disease variants with physiologic function.
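
The activity-by-contact idea can be made concrete with a short sketch. The summary above states only that activity is weighted by contact frequency. The normalization used here, dividing each element's product by the sum over all candidate elements for the gene, is an assumption based on the published ABC model as commonly described. The element names and numbers are hypothetical.

```python
def abc_scores(elements: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Compute an activity-by-contact score for each candidate element.

    elements maps an element name to (activity, contact frequency with
    the target gene). Each score is activity x contact, divided by the
    sum of that product over all candidate elements (assumed
    normalization), so the scores for one gene sum to 1.
    """
    products = {name: activity * contact
                for name, (activity, contact) in elements.items()}
    total = sum(products.values())
    return {name: product / total for name, product in products.items()}


# Hypothetical candidate elements for a single target gene.
candidates = {
    "enhancer_1": (10.0, 0.5),   # strong activity, frequent contact
    "enhancer_2": (4.0, 0.25),   # weaker activity, rarer contact
    "enhancer_3": (8.0, 0.5),
}
for name, score in abc_scores(candidates).items():
    print(f"{name}: {score:.2f}")
```

In this toy case enhancer_1 receives half of the total score. A highly active element that rarely contacts the gene would score low, which is the behavior the weighting is meant to capture.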

MACROMOLECULAR CHARACTERIZATION

Sharp J S, Chea E E, Misra S K, Orlando R, Popov M, Egan R W, Holman D, Weinberger S R. Flash Oxidation (FOX) System: A novel laser-free fast photochemical oxidation protein footprinting platform. Journal of the American Society for Mass Spectrometry 32;2021:1601-1609.

Hydroxyl radical protein footprinting has the potential to contribute substantially to the study of protein topography and protein interactions. Rapid protein labeling for this purpose can be accomplished by fast photochemical oxidation of proteins using photolysis products of hydrogen peroxide induced with a UV laser. However, this process is difficult to control because the hydroxyl radicals upon which it depends are very broadly reactive and are therefore vulnerable to scavenging by any organic additives in the protein solution. The dosage of incident radiation that produces the radicals also requires careful optical alignment. Additionally, operation of the KrF laser commonly used for the procedure is potentially hazardous, so the procedure requires detailed attention to safety precautions. Sharp et al. here describe instrumentation that contributes to the safety and reproducibility of the methodology. It incorporates a fully enclosed UV illumination source for safety. It delivers a 10 µs flash of energy accurately focused on a capillary flow cell for reproducible, short-duration reactions. In a module downstream of the illuminator, a UV photometric absorbance detector measures the change in absorbance of an internal standard that quantifies the dosage of radicals produced. This permits real-time adjustment to compensate for variation resulting from scavenging of free radicals. Sample is introduced to the system via a flow injector for reproducible sample delivery. It is hoped that this instrumentation will encourage wider use of photochemical footprinting for protein analysis.

Mckenzie-Coe A A, Johnson D T, Peacock R B, Zhang Z, Jones L M. Evaluating the sulfate radical anion as a new reagent for in-cell fast photochemical oxidation of proteins. Journal of the American Society for Mass Spectrometry 32;2021:1644-1647.

Fast photochemical oxidation of proteins may be performed not only on purified proteins in solution but also on intact cells where proteins interact within their cellular environment. For this purpose, hydrogen peroxide may be introduced into the cells via aquaporin channels. Out of concern that hydrogen peroxide is itself involved in physiological intracellular reactions that may perturb cells, McKenzie-Coe et al. explore the replacement of hydroxyl radicals generated from hydrogen peroxide by sulfate radical anions generated by photolysis of sodium persulfate. The authors find that cell viability remains high at sodium persulfate concentrations up to 100 mM. Sulfate radical anions modify a 1.5x greater number of proteins than hydroxyl radicals produced from hydrogen peroxide at the same concentration, while displaying a similarly broad reactivity profile. The results suggest that the sulfate radical anion may provide an alternative to the hydroxyl radical for investigation of protein topography and interactions.

Wensien M, Von Pappenheim F R, Funk L-M, Kloskowski P, Curth U, Diederichsen U, Uranga J, Ye J, Fang P, Pan K-T, Urlaub H, Mata R A, Sautner V, Tittmann K. A lysine–cysteine redox switch with an NOS bridge regulates enzyme function. Nature 593;2021:460-464.

Wensien et al. describe a previously unrecognized type of covalent crosslink in proteins. It forms between the side chains of a cysteine and a lysine residue, creating an N–O–S bridge. While studying the pentose phosphate pathway enzyme transaldolase in Neisseria gonorrhoeae, the pathogen that causes gonorrhea, the authors discovered that the recombinant enzyme was catalytically inactive, but could be activated with a reducing agent. The protein has 3 cysteine residues, which suggested that formation of an inappropriate disulfide bridge might be responsible for the inactivity. However, when the cysteines were individually replaced, replacement of only one of them yielded an active enzyme, even though replacement of either member of a disulfide-forming pair would be expected to have this effect. Recognizing that this result is inconsistent with a disulfide mechanism, the authors used X-ray diffraction to determine the 3-D structure of transaldolase crystallized under reducing and oxidizing conditions. The structure revealed the N–O–S bridge. They suggest that the bridge is formed when the lysine amine is oxidized by molecular oxygen or a reactive oxygen species to hydroxylamine or amine oxide, both of which are strong O-nucleophiles, and the cysteine thiol is oxidized either in a concerted or an independent reaction to sulfenic acid. The oxidized amino group would then react with the oxidized thiol to form an N–O–S bridge. In a search of the Protein Data Bank for structures with unexplained electron density between apposed lysine and cysteine residues, the authors identify likely N–O–S bridges in diverse proteins. They suggest that these bridges may function as redox switches for regulation of enzyme activity, and raise the possibility that the bridge could be engineered into proteins, or pharmaceuticals could be designed to target the bridge structure.

IMAGING

Chang B-J, Manton J D, Sapoznik E, Pohlkamp T, Terrones T S, Welf E S, Murali V S, Roudot P, Hake K, Whitehead L, York A G, Dean K M, Fiolka R. Real-time multi-angle projection imaging of biological dynamics. Nature Methods 18;2021:829-834.

Three-dimensional (3-D) images are usually constructed by serial acquisition of a stack of focal plane images. However, imaging rates could be made much faster by integrating the information in such a stack into a single raster scan. Chang et al. describe a simple scanning unit for constructing 3-D images. The unit converts any camera-based microscope with optical sectioning capability into a system that can integrate information from diverse viewing angles without rotating the sample or using multiple detection paths. This is done by optical shearing during acquisition of a single camera frame. Two galvanometric mirrors are placed in front of the camera sensor and are rotated at the same velocity while synchronously scanning the focal plane, thereby introducing shear during a single camera exposure. The resulting sheared projection image is equivalent to a projection image of a rotated specimen. The authors demonstrate this principle with several kinds of microscope and with diverse specimens. The rapidity of image acquisition enables recording of fast processes such as blebbing of cultured cells. The authors employ acquisition rates up to 119 Hz. They also use a beam splitter and a second camera to acquire orthogonal projections of a beating zebrafish heart with high spatial resolution.
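The equivalence between a sheared projection and a view of a rotated specimen can be illustrated numerically. The sketch below is a software analogy only, not the authors' optical implementation: it shifts each focal plane of a stack laterally in proportion to its depth and then sums the planes. The function name and the use of whole-pixel shifts are assumptions, and a practical implementation would use an array library rather than nested lists.

```python
def sheared_projection(stack, shift_per_plane):
    """Sum a stack indexed [z][y][x] after shifting plane k by k * shift_per_plane pixels in x.

    shift_per_plane = 0 gives an ordinary sum projection along z. A non-zero
    integer shift mimics a view from an oblique angle, up to a scale factor
    along the sheared axis.
    """
    nz, ny, nx = len(stack), len(stack[0]), len(stack[0][0])
    max_shift = abs(shift_per_plane) * (nz - 1)
    out = [[0.0] * (nx + max_shift) for _ in range(ny)]
    for k, plane in enumerate(stack):
        # Keep all offsets non-negative so negative shears also fit in the output.
        offset = k * shift_per_plane if shift_per_plane >= 0 else max_shift + k * shift_per_plane
        for y, row in enumerate(plane):
            for x, value in enumerate(row):
                out[y][x + offset] += value
    return out

# Two point sources at the same (x, y) position but different depths.
stack = [[[0.0, 1.0, 0.0, 0.0]],   # z = 0
         [[0.0, 0.0, 0.0, 0.0]],   # z = 1
         [[0.0, 1.0, 0.0, 0.0]]]   # z = 2

print(sheared_projection(stack, 0))  # the points overlap in the straight projection
print(sheared_projection(stack, 1))  # the points separate in the sheared projection
```

In the straight projection the two points are indistinguishable. In the sheared projection they are separated by an amount proportional to their difference in depth, which is the depth cue a rotated view would supply.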

Tian H, Sheraz Née Rabbani S, Vickerman J C, Winograd N. Multiomics imaging using high-energy water gas cluster ion beam secondary ion mass spectrometry [(H₂O)_n-GCIB-SIMS] of frozen-hydrated cells and tissue. Analytical Chemistry 93;2021:7808-7814.

Tian H, Sparvero L J, Anthonymuthu T S, Sun W-Y, Amoscato A A, He R-R, Bayır H, Kagan V E, Winograd N. Successive high-resolution (H₂O)_n-GCIB and C₆₀-SIMS imaging integrates multi-omics in different cell types in breast cancer tissue. Analytical Chemistry 93;2021:8143-8151.

Secondary ion mass spectrometry (SIMS), in which primary ions bombarding a surface desorb secondary ions for mass spectral analysis, has long provided an important approach to tissue imaging. The technique is attractive because of the high spatial resolution that may be achieved. However, SIMS is perpetually limited by the yield of secondary ions. A succession of improvements has been made by identifying ever better primary ions for the purpose. Tian et al. here explore the capabilities of water gas cluster ions as primary ions for biological imaging. The primary ion beam is formed similarly to the argon cluster beams originally used in SIMS. Steam from a boiler is allowed to undergo adiabatic expansion, which allows clusters of neutral water molecules to form upon cooling. These clusters pass through a collimating skimmer into a chamber where they are ionized by electron bombardment, after which they are filtered according to mass, accelerated, and focused for impact upon a tissue section. The authors show that cryogenic tissue preparation is optimal for the technique. No organic matrix is applied to the specimen surface. In their first paper, they use a 70 keV (H₂O)_28,000⁺ beam to image cultured cells with a beam spot size of just 1 µm for very high resolution, and a 70 keV (H₂O)_31,000⁺ beam with a spot size of 6 µm to image tissue sections. They observe secondary ion signal enhancement for selected lipid species of up to 200-fold relative to signals generated by the 70 keV (CO₂)_11,500⁺ beam formerly employed. The high-energy beam enables high spatial resolution without loss of sensitivity. Desorption of a wide variety of endogenous molecules is recorded, including lipids, metabolites and peptides. In their second paper, Tian et al. perform lipidomic and metabolomic profiling of frozen-hydrated breast cancer tissue sections. They image first with a water cluster beam (spot size 1.6 µm). Then they apply cell-type specific lanthanide-conjugated antibodies and image the same tissue section using a C₆₀ ion beam (spot size 1.1 µm) to identify cell types. The distribution and intensities of >150 metabolites or lipids are recorded, demonstrating variation in tumor cells and infiltrating immune cells.
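As a rough aid to comparing the cluster beams mentioned above, the following sketch computes the average kinetic energy carried by each molecule in a cluster. It assumes a singly charged cluster whose beam energy is shared equally among its constituent molecules. This simplification, the function name, and the plain-text beam labels are introduced here for illustration and are not taken from the papers.

```python
def energy_per_molecule_ev(beam_energy_kev, cluster_size):
    """Average kinetic energy per constituent molecule, in eV (equal-sharing assumption)."""
    return beam_energy_kev * 1000.0 / cluster_size

# Beam energies (keV) and nominal cluster sizes quoted in the summary above.
beams = {
    "(H2O)_28,000+": (70, 28000),
    "(H2O)_31,000+": (70, 31000),
    "(CO2)_11,500+": (70, 11500),
}

for name, (energy_kev, size) in beams.items():
    print(f"{name}: {energy_per_molecule_ev(energy_kev, size):.2f} eV per molecule")
```

Under this assumption the water cluster beams deliver roughly 2.3 to 2.5 eV per molecule and the carbon dioxide beam about 6.1 eV per molecule. Whether this difference contributes to the reported signal enhancement is not addressed in the summary.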

CELL BIOLOGY

Delorey T M, Ziegler C G K, Heimberg G, Normand R, Yang Y, Segerstolpe Å, Abbondanza D, Fleming S J, Subramanian A, Montoro D T, Jagadeesh K A, Dey K K, Sen P, Slyper M, Pita-Juárez Y H, Phillips D, Biermann J, Bloom-Ackermann Z, Barkas N, Ganna A, Gomez J, Melms J C, Katsyv I, Normandin E, Naderi P, Popov Y V, Raju S S, Niezen S, Tsai L T Y, Siddle K J, Sud M, Tran V M, Vellarikkal S K, Wang Y, Amir-Zilberstein L, Atri D S, Beechem J, Brook O R, Chen J, Divakar P, Dorceus P, Engreitz J M, Essene A, Fitzgerald D M, Fropf R, Gazal S, Gould J, Grzyb J, Harvey T, Hecht J, Hether T, Jané-Valbuena J, Leney-Greene M, Ma H, Mccabe C, Mcloughlin D E, Miller E M, Muus C, Niemi M, Padera R, Pan L, Pant D, Pe’er C, Pfiffner-Borges J, Pinto C J, Plaisted J, Reeves J, Ross M, Rudy M, Rueckert E H, Siciliano M, Sturm A, Todres E, Waghray A, Warren S, Zhang S, Zollinger D R, Cosimi L, Gupta R M, Hacohen N, Hibshoosh H, Hide W, Price A L, Rajagopal J, Tata P R, Riedel S, Szabo G, Tickle T L, Ellinor P T, Hung D, Sabeti P C, Novak R, Rogers R, Ingber D E, Jiang Z G, Juric D, Babadi M, Farhi S L, Izar B, Stone J R, Vlachos I S, Solomon I H, Ashenberg O, Porter C B M, Li B, Shalek A K, Villani A-C, Rozenblatt-Rosen O, Regev A. COVID-19 tissue atlases reveal SARS-CoV-2 pathology and cellular targets. Nature 595;2021:107-113.

Melms J C, Biermann J, Huang H, Wang Y, Nair A, Tagore S, Katsyv I, Rendeiro A F, Amin A D, Schapiro D, Frangieh C J, Luoma A M, Filliol A, Fang Y, Ravichandran H, Clausi M G, Alba G A, Rogava M, Chen S W, Ho P, Montoro D T, Kornberg A E, Han A S, Bakhoum M F, Anandasabapathy N, Suárez-Fariñas M, Bakhoum S F, Bram Y, Borczuk A, Guo X V, Lefkowitch J H, Marboe C, Lagana S M, Del Portillo A, Zorn E, Markowitz G S, Schwabe R F, Schwartz R E, Elemento O, Saqi A, Hibshoosh H, Que J, Izar B. A molecular single-cell lung atlas of lethal COVID-19. Nature 595;2021:114-119.

Standards for cellular description of pathologic changes and interpretation of these cellular changes in terms of pathophysiologic processes have been irrevocably altered by the widespread adoption of single cell transcriptomic analysis as performed by single cell RNA sequencing (scRNAseq) or single nucleus RNA sequencing (snRNAseq). Two independent studies are chosen here to illustrate the use of these methodologies. Both describe the effects of COVID-19 on human tissues. COVID-19 may cause respiratory distress syndrome, sometimes with multiple organ failure. The two studies examine flash-frozen tissues derived at autopsy from donors who died of COVID-19. Both groups select snRNAseq in preference to scRNAseq. The two methods yield different measurements of the proportions of the cell types, but snRNAseq is more suitable for handling hard-to-dissociate tissues and for processing frozen samples, so it more effectively decouples sample processing from the circumstances of tissue acquisition. Both groups perform broad categorization of cell types by automated, unbiased comparison with previously reported cell atlases, but use manual curation to refine the categorization for annotation of subclusters and cell states. In the lung, both groups note that comparison with control patient samples shows a reduction in alveolar epithelial cells of both the AT2 and AT1 types in COVID-19. AT2 cells serve as progenitors for AT1 cells in lung regeneration. The proportions and expression signatures of these cell types indicate that the regeneration program is initiated in patients who die of COVID-19 but the response is inadequate. There is a strong inflammatory response with dense infiltration by myeloid cells but an impaired T lymphocyte response. Gene expression profiles show the activity of these various cell types and indicate an aberrant pattern of myeloid cell activation compared with other viral or bacterial pneumonias. Fibroblasts are numerous and show phenotypes consistent with fibrosis. Delorey et al. also investigate the effects of COVID-19 on the heart, kidney and liver. They note increases in the proportion of vascular endothelial cells in the heart and identify multiple changes in transcriptional programs within cell types in these tissues indicating functional changes. Delorey et al. also test for the presence of SARS-CoV-2 RNA. They do not detect viral RNA in heart, liver or kidney. Viral RNA in the lung varies in a pattern consistent with higher levels of virus at chronologically earlier stages of infection. These datasets provide a wealth of information about tissue responses to SARS-CoV-2 that, it is hoped, will prove useful in understanding not just COVID-19 mortality but also non-lethal but prolonged complications of the disease.

Spencer Chapman M, Ranzoni A M, Myers B, Williams N, Coorens T H H, Mitchell E, Butler T, Dawson K J, Hooks Y, Moore L, Nangalia J, Robinson P S, Yoshida K, Hook E, Campbell P J, Cvejic A. Lineage tracing of human development through somatic mutations. Nature 595;2021:85-90.

Spencer Chapman et al. make use of somatic mutations that occur during embryonic development, detected by whole-genome sequencing of single cells derived from hematopoietic stem and progenitor cells (HSPCs), to reconstruct a phylogenetic tree of fetal development. They collect hematopoietic organs from 2 human fetuses (8 and 18 weeks post-conception) with informed parental consent following termination of pregnancy. Individual cells are expanded and subjected to whole-genome sequencing to a depth sufficient for reliably calling single-nucleotide variants. Somatic mutations can be used as barcodes for lineage tracing. By examining such mutations, and coupling the data with deep, targeted sequencing to determine the distribution of the same mutations in tissues of known embryonic origin, the authors are able to infer the timing and divergence of HSPCs, and estimate the number of cells from which HSPCs arise at different stages of embryonic development. The data indicate a remarkably high burden of somatic mutations: 25.5 single-nucleotide substitutions per HSPC at 8 weeks and 41.9 at 18 weeks; 2.09 indels at 8 weeks and 2.13 at 18 weeks. The mutation rate is especially high during the first 3 cell divisions of the zygote, with a mean of 2.4 mutations per division acquired in each daughter cell. However, the mutation rate is much lower thereafter. This study shows how unresolved questions about the developmental origins of human tissues may be illuminated in the future by study of somatic mutations as naturally occurring barcodes.
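The logic of using somatic mutations as barcodes can be sketched in a few lines. This is a toy illustration under simplifying assumptions (error-free calls, each mutation arising once and never reverting), not the authors' phylogenetic method. The cell names, mutation labels and function names are hypothetical.

```python
def clades_from_mutations(cell_mutations):
    """Map each mutation to the set of cells that carry it (its clade)."""
    clades = {}
    for cell, mutations in cell_mutations.items():
        for mutation in mutations:
            clades.setdefault(mutation, set()).add(cell)
    return {mutation: frozenset(cells) for mutation, cells in clades.items()}

def is_tree_compatible(clades):
    """True if every pair of clades is nested or disjoint, as a single tree requires."""
    distinct = list(set(clades.values()))
    for i, a in enumerate(distinct):
        for b in distinct[i + 1:]:
            if a & b and not (a <= b or b <= a):
                return False
    return True

def shared_mutations(cell_mutations, cell_a, cell_b):
    """Mutations inherited by both cells from their common ancestors."""
    return cell_mutations[cell_a] & cell_mutations[cell_b]

# Hypothetical genotypes: each cell is described by the somatic mutations it carries.
cells = {
    "cell1": {"m1", "m2", "m4"},
    "cell2": {"m1", "m2", "m5"},
    "cell3": {"m1", "m3"},
    "cell4": {"m6"},
}

clades = clades_from_mutations(cells)
print(sorted(clades["m1"]))                             # early mutation, shared by three cells
print(sorted(shared_mutations(cells, "cell1", "cell2")))  # two shared mutations: later divergence
print(sorted(shared_mutations(cells, "cell1", "cell3")))  # one shared mutation: earlier divergence
print(is_tree_compatible(clades))
```

Cells that share more mutations diverged more recently, and a mutation carried by many cells must have arisen early. Counting the mutations on early branches is the kind of information from which per-division mutation rates, such as those reported above, can be estimated.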