This column highlights recently published articles that are of interest to the readership of this publication. We encourage ABRF members to forward information on articles they feel are important and useful to Clive Slaughter, AU-UGA Medical Partnership, 1425 Prince Avenue, Athens GA 30606. Tel: (706) 713-2216; Fax: (706) 713-2221; Email: [email protected], or to any member of the editorial board. Article summaries reflect the reviewer’s opinions and not necessarily those of the Association.
Given the increase in usage of long-read RNA sequencing technology, a benchmarking study here evaluates the effectiveness of long-read methods in fulfilling the various goals of transcriptome analysis. The study compares different cDNA preparation methods, including cDNA kits, rolling circle amplification to concatemeric consensus (R2C2) for increased sequence accuracy, CapTrap for enrichment of 5’-capped RNAs, and direct RNA sequencing. The sequencing platforms evaluated are Pacific Biosciences (PacBio) Sequel II and Oxford Nanopore Technologies (ONT) MinION. The data are also compared with short-read sequencing. cDNA-PacBio and R2C2-ONT yield the longest read-lengths; CapTrap-PacBio, cDNA-PacBio and R2C2-ONT produce the highest sequence quality (Q score); and CapTrap-ONT and cDNA-ONT provide the largest numbers of reads. Greater read length produces better structural accuracy in the specification of start/termination sites and junctions. In the quantification of transcript abundance, wide variation between laboratories is observed in the number of structurally specified transcripts, and in their quantification. Little overlap is noted between the transcripts defined by any two pipelines during quantification studies. However, greater read depth produces better quantification accuracy. De novo transcript identification in the absence of a well annotated reference genome sequence remains problematic.
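For readers less familiar with sequence quality scores, the following minimal sketch shows how a Q score maps to a per-base error probability. It assumes the Q scores reported in the study are on the standard Phred scale (Q = -10 log10 p); the function names are illustrative and are not taken from the study.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Convert a Phred-scale quality score Q into a per-base error probability."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Convert a per-base error probability into a Phred-scale quality score."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    # Illustrative Q values only; these are not results from the benchmark.
    for q in (10, 20, 30):
        print(f"Q{q}: error probability {phred_to_error_probability(q):.4f}")
```

On this scale, each increase of 10 in Q corresponds to a tenfold reduction in the expected error rate.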
Nanda et al. seek to broaden application of Pacific Biosciences (PacBio) single-molecule DNA sequencing by reducing the amount of DNA needed for library construction when performed without PCR to avoid loss of modified bases and possible introduction of biased sampling. At present, 1-5 µg of DNA is needed for such library construction, corresponding to 150,000-750,000 cells. The authors deploy Tn5 for simultaneous template fragmentation and transposition (tagmentation) with hairpin adaptors to produce transposomes of extended, tunable length (> 1 kb) for circularization via optimized gap repair. These transposomes are used for single-molecule sequencing. The authors achieve detection of genetic variation and CpG methylation with as little as 40 ng of DNA (7,000 cell equivalents) with accuracy comparable to conventional whole-genome and bisulfite sequencing. They further adapt the methodology for mapping chromatin structure by footprinting with a non-specific adenine methyltransferase. With this technique, they investigate CTCF and nucleosome occupancy, and CpG methylation in DNA from samples of 30,000-50,000 cells to reveal signatures in metastatic cancer cells.
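The conversion between DNA mass and cell equivalents quoted above can be approximated as follows. This is a rough sketch that assumes about 6.6 pg of DNA per diploid human cell, a commonly cited approximation that is not stated in the summary; the constant and function name are illustrative.

```python
# Assumed DNA content of one diploid human cell, in picograms (approximate).
PG_PER_DIPLOID_CELL = 6.6


def cell_equivalents(dna_mass_ng: float, pg_per_cell: float = PG_PER_DIPLOID_CELL) -> float:
    """Estimate how many cells would supply the given mass of genomic DNA."""
    # 1 ng = 1000 pg
    return dna_mass_ng * 1000.0 / pg_per_cell


if __name__ == "__main__":
    for mass_ng in (1000.0, 5000.0, 40.0):  # 1 ug, 5 ug and 40 ng
        print(f"{mass_ng:g} ng is roughly {cell_equivalents(mass_ng):,.0f} cells")
```

With this assumed value, 1-5 µg corresponds to roughly 150,000-760,000 cells and 40 ng to roughly 6,000 cells, of the same order as the figures quoted above; the exact numbers depend on the per-cell DNA mass assumed.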
Enzyme-catalyzed glycosylation can be accomplished in a highly regioselective and stereoselective manner without the need for protecting the many hydroxyl groups from unintended reaction. But non-enzymic synthesis generally requires multi-step sequences of hydroxyl group protection, functionalization of the reactive center with anomeric leaving groups, and deprotection. Jiang et al. now describe a simplified method of chemical glycosylation for convenient formation of glycomimetics such as C-glycosides, S-glycosides and Se-glycosides. They synthesize C-glycosides in a metal- and protecting group-free process. The greater acidity of the anomeric hydroxyl compared to the other hydroxyls in a monosaccharide permits selective formation of a 2,3,5,6-tetrafluoropyridine-4-thioglycoside derivative with 2-chloro-1,3-dimethylimidazolinium chloride under mildly basic conditions. This is termed a “capping” reaction. The thioglycoside is then reacted with acrylate to undergo desulfurative C–C coupling under blue LED irradiation at ambient temperature. This glycosylation reaction provides a stereochemically pure (β) product. The scheme works with a broad variety of mono- and oligo-saccharides, and also successfully forms S-glycosides and Se-glycosides. The authors also show coupling of D-mannose, D-galactose and N-acetylglucosamine to dehydroalanine residues tagged to proteins, thereby demonstrating applicability of the scheme to synthesis of C-glycosylproteins analogous to biological O- and N-glycoproteins.
Keener et al. provide optimized mass spectrometric protocols for analysis of a challenging family of molecules, the lipopolysaccharides and lipooligosaccharides of Gram-negative bacteria. These molecules are components of the bacterial outer membrane. Their structures are of key biomedical interest as determinants of bacterial stability, toxicity and pathogenesis. The molecules comprise three regions: a strain-specific outer glycan called the O antigen, a less variable inner core oligosaccharide, and a lipid A moiety that consists of a phosphorylated glucosamine disaccharide decorated with several fatty acids that anchor the molecule to the bacterial membrane. The present authors seek to avoid the need for chemical derivatization commonly used in analysis of the structure, which, though convenient, results in loss of information. They employ a multistage activation method termed activated-electron photodetachment. Negatively charged ions in the gas phase are separated in an MS1 stage, then subjected to ultraviolet photoactivation to generate charge-reduced radical species. The primary charge-reduced precursor ion is then isolated in an MS2 stage and subjected to collision-induced dissociation (CID) to generate informative product ions analyzed in an MS3 stage. These manipulations may be preceded by ultrahigh performance liquid chromatography (UHPLC) for prior resolution of analyte mixtures. The mass spectrometric protocol yields abundant glycosidic and cross-ring cleavages for structural characterization of the glycan portion of the molecule, and facilitates determination of glycan branching patterns and localization of sub-stoichiometric modifications. The authors use the procedure to elucidate interesting structural features of the lipopolysaccharide from the human gastrointestinal bacterium Bacteroides fragilis.
When liquid undergoes pressure-driven, laminar flow through a capillary tube, its linear velocity decreases radially from the tube center to the tube walls because of frictional drag. A pulse of solute injected into the tube therefore becomes parabolic in shape in the axial direction, tending to disperse the pulse. At the tube wall, the solute concentration is consequently reduced at the leading edge of the pulse, and elevated at the trailing edge. Diffusion of solute counteracts these radial concentration gradients, tending to maintain compactness of the solute pulse in the axial direction. These opposing effects result in axial spreading of the solute pulse with solute concentration rising at the leading edge and falling at the trailing edge – a phenomenon known as Taylor dispersion. Taylor dispersion is greatest for solutes with small diffusion coefficients, and least for solutes with large diffusion coefficients. Consequently, at the leading and trailing edges of the pulse, partial separation of solutes with different molecular sizes occurs, based on their different diffusion coefficients, a phenomenon known as Taylor-Aris dispersion. Szabo et al. use Taylor-Aris dispersion as the basis for acquisition of electrospray (ESI) spectra of proteins or protein complexes submitted for analysis in solutions containing nonvolatile electrolytes (Tris buffer, NaCl, etc.) that would impair their detection by signal suppression, exacerbation of baseline noise, and/or signal splitting resulting from salt adduction. The authors demonstrate strong ESI spectra for proteins derived from the leading edge, and sometimes also the trailing edge, of a sample transported through a fused silica capillary tube by NH4OAc solvent, while signal from the protein in the center of the pulse is undetectable.
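The dependence of dispersion on diffusion coefficient can be illustrated with the classical Taylor-Aris expression for the effective axial dispersion coefficient in a circular tube, K = D + a²U²/(48D), where D is the molecular diffusion coefficient, a is the tube radius and U is the mean flow velocity. The sketch below is not taken from Szabo et al.; the capillary radius, flow velocity and diffusion coefficients are hypothetical, order-of-magnitude values, and the expression applies only once radial diffusion has had time to act across the tube.

```python
def taylor_aris_dispersion(diffusion: float, radius: float, velocity: float) -> float:
    """Effective axial dispersion coefficient K = D + a^2 U^2 / (48 D), in m^2/s.

    diffusion: molecular diffusion coefficient D (m^2/s)
    radius:    inner radius of the capillary a (m)
    velocity:  mean linear flow velocity U (m/s)
    """
    return diffusion + (radius ** 2) * (velocity ** 2) / (48.0 * diffusion)


if __name__ == "__main__":
    radius = 50e-6    # hypothetical capillary inner radius: 50 micrometers
    velocity = 1e-3   # hypothetical mean flow velocity: 1 mm/s

    # Hypothetical, order-of-magnitude diffusion coefficients.
    solutes = {"small salt ion": 1.5e-9, "protein": 1.0e-10}

    for name, d in solutes.items():
        k = taylor_aris_dispersion(d, radius, velocity)
        print(f"{name}: D = {d:.1e} m^2/s, K = {k:.2e} m^2/s")
```

Under these assumptions the slowly diffusing protein has the larger effective dispersion coefficient, so its pulse spreads further axially than that of the salt, which is the basis for finding protein relatively free of salt at the edges of the pulse.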
Sub-microliter droplets created within a microfluidic environment provide enormous numbers of assay volumes for high-throughput, massively parallel analysis and synthesis. An enduring challenge is how to change the content of such droplets in a simple and quantitatively precise way at key times after they have been formed, and to do so without resorting to merging or splitting operations requiring ancillary steps. Krishnamurthy et al. contribute new methodology for this purpose. They produce an emulsion of nanoliter-scale water-in-oil droplets in a conventional way by shearing at a T-junction in which oil and water streams mix. The droplets are then passed through a central channel that is flanked by two electrolyte-filled auxiliary channels. The central channel communicates with the auxiliary channels on one side via an anion exchange membrane and on the other side via a cation exchange membrane. Droplets traveling through the central channel contact the two membranes. A power supply maintains an electrical potential difference between the two auxiliary channels to induce ion transport across the membranes. To deplete (desalt) the droplets of electrolytes, the anion exchange membrane connects to the anodic auxiliary channel and the cation exchange membrane connects to the cathodic channel. To introduce ions into the droplets, the electrodes are reversed. This electrokinetic device achieves high desalting efficiency. Rates of ion transport can be adjusted by varying the voltage to achieve desired ionic strength in the droplets. The technique is anticipated to provide a broadly applicable way to accomplish precise changes in droplet composition on demand.
The work of two groups heralds new capabilities for facile, programmable insertion, excision and inversion of kilobase-scale DNA sequences in genome editing. They envision sequence manipulations mediated by enzymes derived from recombinases. Recombinases are used by mobile genetic elements for their transposition. Recombinases suitable for genome engineering would rely upon programmable, non-coding RNA to determine site specificity on the basis of sequence at the donor and target DNA sites. Durrant et al. show that the IS110 family of insertion sequences (mobile genetic elements) encodes a small (300-460 amino acid) recombinase enzyme, and an untranslated RNA of 150-250 nucleotides. The untranslated RNA includes two distinct loops, one of which binds the donor DNA while the other binds the target DNA. In an accompanying paper, the group describes structural studies showing that each loop interacts with a recombinase dimer. The active complex therefore consists of a recombinase tetramer, an RNA guide (which Durrant et al. term a bridge RNA and Siddiquee et al. term a seeker RNA), and donor and target DNA strands. A composite catalytic center on the recombinase spans the two dimers. The protein positions both the donor and the target DNAs to interact with this catalytic center. The distinct RNA loops that interact with the donor and target sequences provide a very convenient means for programming specificities for the desired sequences, and overcome the more complex programming task necessitated by previously known recombinases, which rely on complex protein-DNA interactions to determine their sequence specificity. The findings of Durrant et al. are confirmed by the work of Siddiquee et al. The two groups presently demonstrate the capabilities for sequence editing only for in vitro DNA and for cells of Escherichia coli. However, it is hoped that, with further engineering, the system may be adapted to work in mammalian cells too.
Because our understanding of cell/tissue biology and pathology relies increasingly on high-throughput analysis of molecular species at the single-cell level, insight into systematic biases introduced by the different methodologies available for single-cell analysis becomes increasingly important. De Rop et al. perform multicenter benchmarking of methods in single-cell assay for transposase-accessible chromatin by sequencing (scATAC-seq), a technology used to identify genomic regions involved in gene regulation. The authors assess eight different scATAC-seq protocols, including commercial methods and variants thereof. They incidentally establish databases and methods for dataset comparison that will be useful to investigators in the future. The data indicate broad agreement between methods in identification of cell types, but large differences are noted in sequencing library quality and tagmentation specificity for open chromatin sites. These differences affect annotation of cell types, identification of differentially accessible chromatin regions, and enrichment of transcription factor motifs. Differences in the economics of experiments are also noted. Investigators will wish to refer to the results of this benchmark study in making choices of methodology, always bearing in mind that performance may differ for sample and tissue types other than those used in this particular work.
Lin et al. contribute streamlined methodology for identifying dynamic protein-protein interactions at the cell surface. The methodology is based upon photocatalytic proximity labeling followed by proteomic identification of interacting partners. The spatial resolution of such methods depends upon the half-life of the reactive intermediates employed for labeling. Interactions over shorter distances are detected by probes with shorter half-life. A variety of probes is available. Lin et al. provide a single method by which a range of such probes can be used to achieve adjustable resolution. For this purpose, they employ a single photocatalyst for activation of all the probes: the commercially available dye Eosin Y (also used in cytologic stains such as Hematoxylin & Eosin). The authors show that, when illuminated with light of a suitable, biocompatible wavelength, Eosin Y activates a diazirine-biotin conjugate to provide a short-lived probe for detecting short-range interactions (∼100 Å), an aryl-azide-biotin conjugate to provide a medium-lived probe for medium-range interactions, and a phenol-biotin conjugate to provide a long-lived probe for long-range interactions. Interacting proteins are captured via the biotin tag. Eosin Y is conjugated to an antibody with specificity for the protein whose interactions are being investigated. The authors validate the system in a study of the epidermal growth factor receptor (EGFR). In the course of this study, they also use AlphaFold-Multimer predictions of binary interactions between proteins to validate the associations they detect and to elucidate their functional consequences.
This paper presents three distinct technical developments that enhance capabilities for tissue imaging: an improved device for tissue sectioning, a chemical technique for hydrogel formation, and a computational tool for analysis of cellular connectivity. The authors integrate these three developments into a single platform for investigation of neural circuitry in the human brain and for investigation of pathophysiologic changes in neurodegenerative disease. For tissue sectioning, a new vibratome is developed. Conventional vibratomes are unable to section across entire organs the size of a human brain. Additionally, out-of-plane blade vibration results in abrasion, tearing and deformation of cut surfaces. The new device provides high-frequency blade vibration of increased amplitude and diminished out-of-plane vibration, enabling sectioning of samples from organoids to intact human brain. For tissue processing, the capabilities of methods previously described by the group are combined to transform tissues into an elastic, reversibly expandable tissue-hydrogel hybrid that preserves tissue architecture, native biomolecular structure, and antigenicity, and permits multiple rounds of relabeling. For analysis of connectivity, inter-slab image registration is accomplished by a semi-automated computational method that matches fluorescently labeled blood vessels. The method matches corresponding neural fibers with the low average error of 1.29 µm. These methods, deployed either individually or together in an integrated platform, are anticipated to contribute to fine-scale tissue mapping over the extended distances spanning whole human organs.
In this pre-publication article posted prior to peer review, Bai et al. highlight an unexpectedly high incidence of functionally significant errors in plasmids that are constructed in academic and industrial laboratories and then submitted for dissemination to the scientific community for use as gene delivery vectors. The authors’ company is a provider of gene delivery tools, and encounters the problem of high error rates during standard quality control of plasmids submitted to them by clients seeking the commercial services they provide. Using restriction mapping and/or Sanger sequencing, the authors routinely validate plasmid structural information provided to them by senders before further processing the plasmids. The authors report that about 15% of plasmids contain more or less subtle design errors that could affect function, and 35% contain sequence errors. Forty percent of plasmids used for making adeno-associated virus (AAV) vectors carry mutations in the inverted terminal repeat (ITR) sequences required for packaging into the vector. The authors ascribe this particular problem to instability resulting from regions of high GC content within these sequences. These findings indicate a need for heightened awareness of design requirements for plasmids. The findings also indicate a need for deployment of quality control of plasmids shared between laboratories to avoid preventable problems encountered in plasmid use.
DNA-encoded chemical libraries (DELs) offer an increasingly productive approach to affinity selection in small-molecule drug discovery. They are prepared by combinatorial chemical synthesis. After each synthesis reaction, a DNA sequence encoding the added chemical building block is ligated to an attached DNA strand. The cycle is repeated in multiple split-and-pool rounds of synthesis. Success of the method may be limited if suboptimal addition of chemical building blocks results in the production of truncated structures, especially since the presence of truncations in the chemical library is not reflected in a decrease in occurrence of the corresponding DNA barcodes in this methodology. The problem of truncations is a substantial concern in fields such as the synthesis of macrocyclic structures. Keller et al. present an ingenious bead-based synthesis method to overcome this limitation. The growing construct is attached to magnetic beads by a linker, and capping of unreacted structures is employed where possible. A second linker is attached to the beads that reacts with the terminal chemical building block, but not with capped structures. When the first linker is cleaved, all truncated structures are released and can be washed away. The second linker is then cleaved to release the desired product as a self-purified species. The purity of DELs enhanced in this way enables better target affinity selection and broadens the scope of chemical reactions that may be used for library construction.
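The scale of split-and-pool synthesis, and the way truncations accumulate, can be illustrated with a short calculation. This is a simplified sketch, not a model from Keller et al.: it assumes that every building block at each cycle is combined with every product of the previous cycle, that coupling yields at each cycle are independent, and that every failed coupling leaves a truncated structure. The building-block counts and yields are hypothetical.

```python
from math import prod
from typing import Sequence


def library_size(building_blocks_per_cycle: Sequence[int]) -> int:
    """Number of encoded compounds: product of building-block counts over all cycles."""
    return prod(building_blocks_per_cycle)


def full_length_fraction(step_yields: Sequence[float]) -> float:
    """Fraction of constructs completing every cycle, assuming independent step yields."""
    return prod(step_yields)


if __name__ == "__main__":
    # Hypothetical three-cycle library with 100 building blocks per cycle.
    print("library size:", library_size([100, 100, 100]))

    # Hypothetical 80% coupling yield at each of three cycles.
    fraction = full_length_fraction([0.8, 0.8, 0.8])
    print(f"full-length fraction: {fraction:.3f}")
    print(f"truncated fraction:   {1 - fraction:.3f}")
```

Under these assumptions, nearly half of the constructs in a three-cycle library would be truncated at 80% yield per step, while each still carries a full-length DNA barcode, which illustrates why removal of truncated products is valuable.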
Analysis of structure-activity relationships of drug candidates is an integral process in drug discovery. It often requires synthesis of molecular analogs in which key atoms are selectively inserted, deleted or replaced in the candidate structure. For this purpose, changing peripheral chemical groups attached to the molecular skeleton is a straightforward, standard approach. However, changing atoms in the skeleton itself is generally much more difficult. This usually requires a completely new, multi-step synthesis. Uhlenbruck et al. here describe a facile reaction scheme for skeletal editing in the case of one commonly encountered chemical structure in drug discovery, the pyrimidine ring. The authors’ scheme involves conversion of a pyrimidine ring to an N-arylpyrimidinium salt. This is then subjected to ring opening by an ethanolic solution of piperidine to form a 3-carbon iminoenamine. Iminoenamines are key building blocks for de novo heterocycle synthesis, and can be converted directly into other heteroaromatic ring systems, including imidazoles, oxazolones and pyridines. In this way, a complex, heterocyclic molecule is deconstructed and then reconstructed on a different skeleton. The scheme is expected to find immediate application in drug development, and may additionally be extendable to other classes of heterocycles.
Although deep learning models such as AlphaFold2 (AF2) and RoseTTAFold provide predictions of protein structure that accord very accurately with experimentally derived structures, the use of AF2 structures in docking studies has not been found to provide comparable accuracy in describing the binding of proteins to known ligands. This has led to concern that AF2 structures may be of limited value for drug discovery. Lyu et al. reexamine this conundrum. They draw a distinction between retrospective studies of the kind hitherto used to examine the issue, and prospective studies that seek to identify new ligands for target proteins. They perform a prospective study with two selected therapeutic targets: the σ2 receptor, an intrinsic membrane protein of the endoplasmic reticulum involved in cholesterol homeostasis, and the serotonin 2A (5-HT2A) receptor, a G protein-coupled receptor that mediates the effects of serotonergic psychedelic drugs. The authors compare the performance of AF2 structures and cryo-electron microscopy (cryo-EM) structures in prospective ligand discovery for these two proteins, assessed in terms of hit rate (defined as the number of experimentally active ligands divided by the number of putative ligands tested), and hit potency (Ki value or EC50). By these outcome measures, AF2 structures and cryo-EM structures behave remarkably similarly for these two proteins. Lyu et al. explain the discrepancy between these and previous results in terms of bias introduced when previously known ligands affect the conformation of target proteins determined experimentally in complex with them, and in turn influence the search for new ligands. They acknowledge that, unlike the AF2 structures for these two proteins, some AF2 structures turn out to be very discrepant from experimental structures. Better methods for distinguishing AF2 structures unsuitable for docking studies are needed. Also, the present study does not use ligand information. However, tools such as RoseTTAFold All-Atom and AlphaFold Latest now enable proteins to be co-folded with small molecules, and are anticipated to improve models for library searching. The authors conclude that AF2 models represent alternate, low-energy conformations that can guide ligand discovery as effectively as experimental structures do.
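The hit-rate measure used in this comparison is a simple ratio, sketched below. The function name and the counts are hypothetical and are not data from Lyu et al.; they serve only to show how two docking campaigns might be compared on this measure.

```python
def hit_rate(n_active: int, n_tested: int) -> float:
    """Fraction of tested putative ligands that prove experimentally active."""
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    if not 0 <= n_active <= n_tested:
        raise ValueError("n_active must lie between 0 and n_tested")
    return n_active / n_tested


if __name__ == "__main__":
    # Hypothetical counts for two docking campaigns against the same target.
    campaigns = {
        "predicted-structure campaign": (25, 100),
        "experimental-structure campaign": (27, 100),
    }
    for name, (active, tested) in campaigns.items():
        print(f"{name}: hit rate = {hit_rate(active, tested):.0%}")
```

Hit rate alone does not capture the quality of the hits, which is why the authors also compare hit potency (Ki or EC50).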