
Microbiome and Microbial Profiling of Arctic Snow Using Whole Genome Sequencing, Psychrophilic Culturing, and Novel Sampling Techniques

Keywords: Microbiome, Greenland, Snow Microbiome, Arctic microbiome, microbial profiling, psychrophilic culturing

Published on Mar 24, 2025

Abstract

Recent advances in massively parallel DNA sequencing have enabled researchers to study new areas of extreme environments. Of particular interest to many researchers are areas of the Arctic that have yet to be comprehensively examined using DNA techniques. These modern approaches to microbial profiling provide critical new systems biology data not previously available from Arctic samples. The discovery of new microbes, microbial biochemical pathways, and biosynthetic gene clusters is critically important when characterizing the Arctic snow microbiome and can provide insights into discovering valuable biosynthetic gene clusters. In this study, 2 L of snow was collected from 15 sites 12 km east of Ilulissat, Greenland, using DNA-free sterile techniques. Snow was allowed to melt and immediately concentrated using the InnovaPrep CP sample concentrator. Whole genome DNA sequencing was performed on extracts using both Illumina and Nanopore sequencing, alongside psychrophilic culturing. Individual cultures were also sequenced to determine whole genome content and species identity. The results showed a wide-ranging microbiome across the snow fields, including bacteria, yeast, and fungi, with Granulicella, Methylobacterium, Nostoc, Sphingomonas, and Streptomyces being consistently detected at higher levels across the majority of sites and sequencing platforms, while Belnapia, Chlorogloea, Hymenobacter, Mesorhizobium, Nocardioides, Pseudomonas, Pseudonocardia, Roseomonas, and Solirubrobacter were detected at comparatively lower abundances. The culture data for snow sites reveal Pseudomonas sp., Pseudomonas fluorescens Group, unknown Microbacteriaceae sp., Variovorax sp., Robbsia andropogonis, and low concentrations of Aureobasidium sp., Stylodothis sp., Sphingomonas sp., Hymenobacter sp., Caballeronia sordidicola, two unknown species of yeast, and one unknown species of bacteria.

Address correspondence to: Scott W. Tighe, Vermont Integrative Genomics Resource, University of Vermont Larner College of Medicine, Burlington, Vermont, USA, 05405 (Email: [email protected]; Phone: 802-656-2482).

Conflict of Interest Disclosures: The authors declare no conflicts of interest.

INTRODUCTION

Microbial research in the Arctic has long interested polar scientists. However, until recent years, molecular methods have not provided the technology to comprehensively study samples from these areas to fully characterize the microbiome. Previous methods were limited to culturing, polymerase chain reaction (PCR), denaturing gradient gel electrophoresis, and capillary sequencing to characterize the microbial profile of these ecologies. These methods lacked the granularity needed to obtain the comprehensive picture of high precision taxonomy, metabolic pathways, and the detection of previously unknown and unculturable microbes. Recent advances in massively parallel sequencing technologies have enabled new approaches for studying extreme environments and the microbial ecology of Arctic environments.[1],[2] The importance of these new data is without question, since the microbiology of these environments contains valuable information that has been previously unobtainable. The detection and description of new microorganisms through metagenomic DNA sequencing allows for the description of new species through metagenomically assembled genomes (MAGs),[3],[4] thus providing insight into the unculturable microbial ecosystem, or “dark matter.” Metagenomics provides data on new biochemical pathways, microbial interactions,[5],[6] new biosynthetic gene clusters and secondary metabolites, and genetic response elements including tracking horizontal gene transfer between microorganisms and viral interactions through MetaHiC analysis.[7] In addition to metagenomic DNA-based technologies, RNA sequencing for community transcriptomics analysis (metatranscriptomics) is an evolving technology that uses both complementary DNA–based[8],[9] and direct RNA sequencing[10] to describe the microbial community transcriptomes and transcriptome interactions such as those seen in Antarctica.[11] While metagenomic and metatranscriptomic sequencing are remarkable advancements in Arctic research, sample processing and nucleic acid extractions have also advanced. The latest InnovaPrep CP concentration system[12] improves sample concentration for liquid samples; new specialty enzymes, such as MetaPolyzyme,[2] facilitate increased microbial lysis of environmental samples. Most importantly, the development of DNA-free reagents reduces the DNA background caused by reagent contamination (ie, “kitome”).[13] This is especially important for ultra-low biomass Arctic samples such as snow and glacial waters.[14]

In this research, Arctic snow was of particular interest since many of the new technologies were implemented for the first time to comprehensively evaluate the snow microbiome with a focus on metagenomics and microbiome. DNA-free collection of Arctic snow, high-volume sample concentration techniques, and high-resolution metagenome DNA sequencing using both Illumina and Oxford Nanopore Technologies (Nanopore, hereafter) were employed along with traditional microbiological culturing techniques.

For this study, snow was collected from a remote undisturbed snow field east of Ilulissat, Greenland, approximately 3° north of the former Distant Early Warning (DEW) Line, at a location free of human and animal activities, for the purpose of recovering novel snow microbiomes and metagenomes that reflect a true Arctic snow environment. These data, combined with psychrophilic culturing, will contribute to a comprehensive catalog of the snow microbiome for taxonomy, functional genomics, and potentially the discovery of novel biosynthetic gene clusters.

MATERIALS AND METHODS

Sample collection

The sample site selection for a large area snow field with significant snow coverage was determined using satellite data from the Earth Observing System Data and Information System Terra/MODIS (worldview.earthdata.nasa.gov), Greenland Ecosystem Monitoring database (https://data.g-e-m.dk/ Aarhus University), and expert guidance from a local field guide.

Three sites were selected for snow collection (Fig. 1): one for primary microbial geomapping located 14 km east of Ilulissat, Greenland (Site G), one from an unrelated snow field near Lake Akinnauq located 7 km west of the primary site (Site A), and the third from the University of Vermont campus. Sampling was performed at the primary geomapping site in a 0.5-km grid pattern. At 2 sampling points, sites 10 and 12, samples contained minor concentrations of soil that was expected to contain a representative microbial diversity. Sample 10D was collected as a mixed soil reference at the snow-soil interface (~1 m depth). The control site near Lake Akinnauq was sampled using the same techniques directly following the primary sampling. One snow sample from the University of Vermont campus was collected for methods development and to represent a North American suburban sample.

Figure 1

Sampling location of research study. (Clockwise left to right) 1. Map of Greenland and location of Ilulissat. 2. Location of Lake Akinnauq (site A) and the microbial geomapping site (site G). 3. Microbial geomapping site in a 0.5 km2 grid pattern with sample locations 1, 4, 7, and 10 numbered. 

Snow samples were aseptically collected using DNA-free sampling techniques in duplicate from a depth of 18-90 cm below the snow surface using sterile 2-L Whirl-Pak bags (B01451 Whirl-Pak Inc., Wilmington, Delaware), sterile plastic disposable scoops, and sterile gloves. New gloves and scoops were used at each sampling site. Snow temperature, quality, GPS coordinates, and time were recorded for each site. Coordinates were logged in real time using a Garmin inReach Mini. A total of 30 samples were collected at 900 ft above sea level, except for the Lake Akinnauq samples, which were at 550 ft above sea level. All samples were stored and transported using a snowmobile equipped with a sled trailer to maintain snow in the dark at ambient temperature. All metadata associated with the samples is available in Table 1.

Table 1

Ilulissat sample collection metadata. Observed snow condition indicates clean, minor debris content, or soil contamination. Snow temperature was measured at exact location of sample collection. Numbered sample sites are from site G.

| Site | Observed snow condition | Elev. (ft) | Latitude | Longitude | Temp (°C) | Sampling depth (cm) | Comments |
|---|---|---|---|---|---|---|---|
| 1 | Clean | 905 | 69.22457 | -50.8145 | -5.5 | 20 | Hard pack fine grain |
| 2 | Clean | 897 | 69.22478 | -50.8115 | -5.7 | 28 | Hard pack fine grain |
| 3 | Clean | 897 | 69.22492 | -50.80868 | -5.5 | 20-30 | Hard pack fine grain |
| 4 | Clean | 906 | 69.22608 | -50.8023 | -5.5 | 20-30 | Hard pack fine grain |
| 5 | Clean | 896 | 69.22327 | -50.80152 | -5.5 | 30 | Hard pack fine grain |
| 6 | Clean | 896 | 69.22169 | -50.80137 | ND | 18 | Shallow hard snow 7" |
| 7 | Clean | 898 | 69.21914 | -50.80231 | ND | 30 | Soft snow gradient layer |
| 8 | Clean | 895 | 69.22034 | -50.80688 | ND | 22-35 | Near ledges |
| 9 | Clean | 903 | 69.22047 | -50.81119 | ND | 30 | Hard pack fine grain |
| 10 | Mixed | 904 | 69.22009 | -50.81541 | -3.6 | 30 | Large granular |
| 10D | Soil | 904 | 69.22009 | -50.81541 | 1.4 | 90 | Ground level with soil |
| 11 | Clean | 895 | 69.22211 | -50.81472 | ND | 45 | Light pack granular |
| 12 | Mixed | 904 | 69.22324 | -50.81501 | -4.1 | 20-30 | Near edge of rock ledge |
| 13 | Clean | 903 | 69.22279 | -50.80964 | ND | 28 | Hard pack fine grain |
| 14 | Clean | 890 | 69.22303 | -50.80493 | -3.2 | 30 | Hard pack fine grain |
| Site A | Clean | 551 | 69.22604 | -50.9564 | -4.2 | 20-30 | Medium granular, near edge of lake |


Sample processing and extraction

All snow samples were thawed at room temperature in their respective Whirl-Pak bags in an upright position overnight at 13.1 °C, with occasional shifting to allow for equal thaw rates and temperature maintenance. Prior to sample concentration using the InnovaPrep CP instrument (InnovaPrep Corp, Drexel, Missouri), two aliquots were aseptically removed and used for 1) pH testing (ColorpHast EMD #1.09535.0007, pH 0 to 14) and 2) archiving in 2-mL screw capped tubes. Liquid sample concentration was performed directly from the Whirl-Pak sample bag using an InnovaPrep CP instrument equipped with 0.2-µm hollow fiber concentrating tips (CC08022-10 InnovaPrep Corp) for a total volume between 600 and 1300 mL (Table 2). Each sample was eluted from the hollow fiber tip using 600 µL of manufacturer supplied, pressurized, sterile phosphate-buffered saline (PBS). Two negative control samples were collected using a new hollow fiber concentrating tip without any liquid processing. Final elution volumes were volumetrically measured with a P1000 micropipette. Of the concentrated sample, 100 µL was transferred to a separate sterile tube reserved for culturing, while the remaining elution was preserved for DNA analysis by adding 500 µL of 100% ethanol as a temporary preservative and placed in tube storage boxes along with ice packs. Samples were transported back to Vermont within 48 hours as carry-on baggage under a valid U.S. Department of Agriculture Animal and Plant Health Inspection Service permit and Greenland license G22-009 for nonexclusive use of genetic materials. Samples were stored at 4 °C until processing within 96 hours.

Table 2

Input and elution volumes for samples processed using the InnovaPrep concentrating pipet (ICP) sample concentrator. Samples were concentrated using standard instrument settings and processing time was recorded. Sample numbers correspond to site location numbers from site G, while sample Lake A corresponds to the sample from site A, at Lake Akinnauq.

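| Sample | Vol concentrated (mL), Rep 1 | Vol concentrated (mL), Rep 2 | Filtration time (min:sec), Rep 1 | Filtration time (min:sec), Rep 2 | Eluted vol from InnovaPrep (µL), Rep 1 | Eluted vol from InnovaPrep (µL), Rep 2 |
|---|---|---|---|---|---|---|
| 1 | 900 | 800 | 5:25 | 5:35 | 570 | 700 |
| 2 | 700 | 600 | 4:27 | 4:15 | 660 | 570 |
| 3 | 1000 | 900 | 5:28 | 5:36 | 510 | 510 |
| 4 | 1100 | 800 | 6:09 | 5:45 | 600 | 600 |
| 5 | 800 | 1000 | 5:15 | 6:44 | 570 | 530 |
| 6 | 1100 | 900 | 7:00 | 5:58 | 710 | 660 |
| 7 | 900 | 1200 | 12:11 | 7:02 | 750 | 720 |
| 8 | 1300 | 850 | 6:29 | 5:27 | 870 | 1000 |
| 9 | 900 | 900 | 5:56 | 6:05 | 1100 | 1000 |
| 10 | 800 | 1100 | 7:29 | 7:26 | 560 | 660 |
| 11 | 800 | 1000 | 5:50 | 6:30 | 590 | 540 |
| 12 | 800 | 700 | 7:23 | 7:39 | 820 | 670 |
| 13 | 600 | 1200 | 5:12 | 5:43 | 500 | 770 |
| 14 | 700 | 1100 | 5:41 | 5:53 | 1000 | 650 |
| Lake A (15) | 950 | 1300 | 6:13 | 7:12 | 750 | 580 |
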
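The volumes in Table 2 imply a per-replicate concentration factor, which can be sketched as the input volume divided by the eluted volume. The function name below is hypothetical and is not part of the authors' workflow; the example values are taken from sample 1, replicate 1.

```python
def concentration_factor(input_ml, eluate_ul):
    """Fold concentration from melted-snow input (mL) to InnovaPrep eluate (uL)."""
    if eluate_ul <= 0:
        raise ValueError("eluate volume must be positive")
    # Convert mL to uL so that both volumes share the same unit.
    return input_ml * 1000.0 / eluate_ul


# Sample 1, replicate 1 in Table 2: 900 mL concentrated, 570 uL eluted.
factor = concentration_factor(900, 570)
print(f"{factor:.0f}-fold")
```

This reflects only the concentrator step; any further reduction in volume during DNA extraction is not included.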

DNA extraction from snow concentrates

DNA extractions were performed on ethanol-preserved samples by first washing with 1x PBS (without Ca++/Mg++) for a total of 2 washes, with centrifugation and pelleting at 1,500 × g for 10 minutes. This included a “trip” blank as an additional negative control. Each sample concentrate was then extracted using a combination of enzymatic cell wall digestion with MetaPolyzyme (MAC4L, MilliporeSigma, St. Louis, Missouri) and bead beating with Matrix A beads (MP Biomedicals, Irvine, CA) for 8 cycles at 30-second intervals at 4000 reps in the presence of MagZorb lysis buffer. The purification of DNA was performed using the MagZorb magnetic DNA extraction and purification kit (MB1004, Promega Corp, Madison, WI). Samples were eluted in 50 µL of 5-mM TRIS and quantified using the Qubit spectrofluorometer (Thermo Fisher, Waltham, MA) with the dsDNA HS Kit (Thermo Fisher Q33230).

DNA LIBRARY AND HIGH-THROUGHPUT SEQUENCING

Purified DNA was used to prepare both Illumina and Nanopore sequencing libraries. Illumina libraries were prepared using the Illumina Nextera XT Library Prep kit according to the manufacturer’s protocol (FC-131-1096, San Diego, California). Samples with less than the required 1 ng of DNA input were added at the maximum volume of 5 µL and processed using the standard method. All samples were processed in 3 groups based on Qubit concentrations: group 1 was >0.40 ng/µL and amplified with 12 cycles of PCR, group 2 contained samples between 0.10 and 0.29 ng/µL and was amplified with 14 cycles of PCR, and group 3 contained samples with <0.10 ng/µL and was also amplified with 14 cycles of PCR. Duplicate PBS and no-template process controls were processed simultaneously. Final libraries were eluted in 40 µL of 10-mM TRIS and quality was determined using the Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA) and Qubit spectrofluorometer. Final library concentrations ranged from 0.34 to 24.3 ng/µL. Pooling of libraries was performed based on the original input concentration to the library synthesis as follows: 10 ng pooled for samples starting with 1-ng input, 5.0 ng for samples with <1-ng input, and 0.5 ng of library for the 4 control samples (Supplemental Table 1). This strategy was implemented to help mitigate over-representation of low biomass samples in the final sequencing data. Pooled libraries were assessed for quality using the Bioanalyzer 2100 HS DNA Kit (5067-4626, Agilent Technologies) and Qubit HS DNA Assay. Sequencing was performed on a full dual lane paired-end rapid run flow cell in 2 x 130 bp configuration on an Illumina HiSeq1500/2500 for a total of 86 Gb from 330 million clusters.
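The pooling rule and the reported run yield can be expressed as a short sketch. The function names are hypothetical and serve only to restate the arithmetic in the text; they are not the authors' code. The yield check assumes two reads per cluster for a paired-end run.

```python
def pooled_amount_ng(input_ng, is_control=False):
    """Library mass (ng) added to the pool, following the rule stated in the text."""
    if is_control:
        return 0.5  # PBS and no-template process controls
    # Samples that reached the 1-ng Nextera XT input get 10 ng; all others get 5 ng.
    return 10.0 if input_ng >= 1.0 else 5.0


def paired_end_yield_gb(clusters, read_length_bp):
    """Nominal yield in gigabases, assuming two reads per cluster."""
    return clusters * 2 * read_length_bp / 1e9


print(pooled_amount_ng(1.0))                 # full-input sample
print(pooled_amount_ng(0.66))                # low-input sample
print(pooled_amount_ng(0.33, is_control=True))
print(round(paired_end_yield_gb(330e6, 130), 1))  # compare with the reported ~86 Gb
```

With 330 million clusters at 2 x 130 bp, the nominal yield is about 85.8 Gb, which agrees with the reported total of 86 Gb.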

Nanopore libraries were prepared using the direct DNA ligation kit with native barcoding (SQK-LSK109 and SQK-NDB114, Oxford Nanopore Technologies Oxford, UK). Input volumes were adjusted to maximum volume input since the required total DNA input was below the recommended 120 ng. Samples were pooled based on equal volume and analyzed on 4 flow cells (FLO-MIN106 Rev 9.4.1) using a GridION MK1 X5 Sequencer.

Culturing, sequencing, and identification

Culturing was performed within 96 hours of sample receipt on both the InnovaPrep CP concentrated snow extract samples and unconcentrated aliquots by aseptically plating 10, 50, and 200 µL using the spread plate technique on R2A agar and incubated for 30 days at 4 °C. Colonies were enumerated, characterized, and sub-cultured to new R2A agar plates for purification and DNA analysis. Pure cultures were preserved using 15% glycerol for long-term storage. Samples without growth are not reported. Forty-three isolates were recovered from the samples, including identical morphotypes between some samples.

The identification of the pure psychrophilic microbial isolates was performed on loop harvested biomass by rapid DNA extraction using a pre-digestion step with MetaPolyzyme (MilliporeSigma, St. Louis, Missouri) for 4 hours followed by DNA extraction with the Qiagen DNA Power Soil Kit (47016, Hilden, Germany) and quantification using the Qubit spectrofluorometer. Whole genome sequencing was performed on each isolate using the Nextera XT library kit as outlined above. Libraries were sequenced using the Illumina MiSeq sequencer equipped with a 300 cycle V2 Nano flow cell (MS-103-1001) for a total of 1 million reads (~7 Mb per isolate from the 43 isolates) and analyzed with One Codex software (onecodex.com) and confirmed by comparison to the National Center for Biotechnology Information Nucleotide database using Basic Local Alignment Search Tool (BLAST)[15] (https://blast.ncbi.nlm.nih.gov/Blast).
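The per-isolate figure of roughly 7 Mb can be reproduced with a short calculation. This is a sketch under two assumptions that the text does not state explicitly: the 1 million reads are divided evenly among the 43 isolates, and each read (cluster) contributes the full 300 sequenced cycles. The function name is hypothetical.

```python
def per_isolate_yield_mb(total_reads, n_isolates, cycles):
    """Approximate megabases per isolate, assuming an even split of reads."""
    reads_per_isolate = total_reads / n_isolates
    # Each cluster is assumed to contribute `cycles` bases (e.g., 2 x 150 bp).
    return reads_per_isolate * cycles / 1e6


mb = per_isolate_yield_mb(total_reads=1_000_000, n_isolates=43, cycles=300)
print(f"~{mb:.1f} Mb per isolate")
```

In practice the split across isolates is unlikely to be perfectly even, so this is a nominal value only.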

Data processing

Reads were merged across sequencing lanes and concatenated within the sample site. Sample data were classified using the One Codex software and further processed using custom Python scripts. Reads were filtered to those found to match either Bacterial or Archaeal organisms. Reads that could be classified to genus or species were considered for further analysis. Any genera found in the associated negative controls at a similar concentration were subtracted from the sample data, which can result in a subtraction of true-positive genera. This is an important consideration in ultra-low biomass samples in which reagent contamination can be over-represented. The remaining read counts were normalized as a proportion of the total bacterial and archaeal genera read counts with a proportion ≥1% cutoff. Graphs were generated using Altair plots. Principal component analysis (PCA) plots were generated using genus-level classification and the trimmed mean of M-values to account for compositional biases, with features contributing equally, to visualize samples in three-dimensional space based on shared similarity (Partek Flow, St. Louis, Missouri).
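The control subtraction and normalization steps can be sketched as follows. This is not the authors' script: the function name, the read counts, and in particular the `control_ratio` threshold used to decide what counts as a "similar concentration" are assumptions introduced for illustration, since the text does not define that criterion.

```python
def profile_genera(sample_counts, control_counts, control_ratio=0.5, min_proportion=0.01):
    """Drop genera seen at similar levels in controls, then keep proportions >= cutoff.

    control_ratio is a hypothetical threshold: a genus is removed when its
    negative-control count is at least this fraction of its sample count.
    """
    kept = {}
    for genus, count in sample_counts.items():
        if control_counts.get(genus, 0) >= control_ratio * count:
            continue  # treated as likely reagent contamination ("kitome")
        kept[genus] = count

    total = sum(kept.values())
    if total == 0:
        return {}

    # Normalize to the remaining genus-level total, then apply the 1% cutoff.
    proportions = {genus: count / total for genus, count in kept.items()}
    return {genus: p for genus, p in proportions.items() if p >= min_proportion}


# Hypothetical genus-level read counts for one site and its negative control.
sample = {"Nostoc": 600, "Granulicella": 395, "Cutibacterium": 100, "Rothia": 5}
control = {"Cutibacterium": 90}
print(profile_genera(sample, control))
```

As the text notes, a rule of this kind can remove true-positive genera that also happen to occur in reagents, so the threshold is a trade-off rather than a fixed value.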

RESULTS

DNA extraction and recovery

DNA was extracted and quantified as outlined above. DNA recoveries varied by site (Table 3). Not surprisingly, sample sites 10 and 12 had the highest DNA recovery since they contained small amounts of soil debris, while the snow samples 1, 2, 3, 5, 11, 13, and 14 had the lowest along with the negative controls. Sites 4, 7, and 8 had moderate levels of DNA detected.

Table 3

DNA recovery from each replicate snow sample. pH values were obtained from the melted snow prior to concentrating with the InnovaPrep CP. Sample Neg are the negative controls representing the “kitome.”

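| Sample | pH of snow | Temp (°C) at collection | Elution vol (µL) | DNA concentration (ng/µL), Replicate 1 | DNA concentration (ng/µL), Replicate 2 | Total DNA recovered (ng), Replicate 1 | Total DNA recovered (ng), Replicate 2 |
|---|---|---|---|---|---|---|---|
| 1 | 6.9 | -5.5 | 50 | 0.07 | 0.09 | 3.5 | 4.5 |
| 2 | 6.5 | -5.7 | 50 | 0.08 | 0.12 | 4.0 | 6.0 |
| 3 | 7.0 | -5.5 | 50 | 0.1 | 0 | 5.0 | 0 |
| 4 | 6.7 | -5.5 | 50 | 0.4 | 0.33 | 20 | 16.5 |
| 5 | 6.6 | -5.5 | 50 | 0.13 | 0.1 | 6.5 | 5.0 |
| 6 | 6.5 | ND | 50 | 0.11 | 0.2 | 5.5 | 10 |
| 7 | 6.4 | ND | 50 | 1.8 | 0.41 | 90 | 20.5 |
| 8 | 6.5 | ND | 50 | 0.18 | 0.43 | 9.0 | 21.5 |
| 9 | 6.5 | ND | 50 | 1.8 | 0.68 | 90 | 34 |
| 10 | 6.7 | -3.6 | 50 | 0.91 | 0.14 | 45.5 | 7.0 |
| 11 | 7.2 | ND | 50 | 0.12 | 0.05 | 6.0 | 2.5 |
| 12 | 6.8 | -4.1 | 50 | 3.0 | 3.5 | 150 | 175 |
| 13 | 6.5 | ND | 50 | 0 | 0.2 | 0 | 10 |
| 14 | 6.8 | -3.2 | 50 | 0 | 0.01 | 0 | 0.5 |
| LakeA (15) | 6.6 | -4.2 | 50 | 0.29 | 0.16 | 14.5 | 8.0 |
| Neg | na | ND | 50 | 0.06 | 0 | 3.0 | 0 |
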
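The totals in Table 3 follow from the Qubit concentration multiplied by the 50-µL elution volume. A minimal sketch of that check, using a hypothetical helper name and three rows copied from the table:

```python
def total_dna_ng(concentration_ng_per_ul, elution_ul=50):
    """Total DNA recovered (ng) = concentration (ng/uL) x elution volume (uL)."""
    return round(concentration_ng_per_ul * elution_ul, 2)


# (sample, replicate-1 concentration in ng/uL, reported replicate-1 total in ng)
rows = [("1", 0.07, 3.5), ("10", 0.91, 45.5), ("12", 3.0, 150.0)]
for sample, conc, reported in rows:
    print(sample, total_dna_ng(conc), reported)
```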

DNA microbiome and metagenomics sequencing

Whole genome sequencing of total DNA recovered from each site was performed using both Illumina- and Nanopore-based sequencing. Illumina sequencing yielded nearly 7 million reads per sample, enough to generate a complete microbiome profile. Nanopore sequencing, unlike Illumina, was restricted to direct sequencing of native DNA without amplification. Because only limited DNA was available from the extracted snow, Nanopore read coverage was very low, but the long reads were still available to complement the short-read Illumina data. The two data types were therefore analyzed separately rather than combined.

Figures 2A (Illumina Data) and 2B (Nanopore Data)
Greenland snow microbiome data for samples classified using the One Codex database and further processed using custom Python scripts. Reads were filtered to bacterial and archaeal organisms and normalized as a proportion of the total reads above a 1% representation. Figure 2A depicts taxonomic annotation based on Illumina short read data, and Figure 2B depicts taxonomic annotation based on Nanopore direct DNA long read sequence data. Samples VT and 10D are excluded from the plot because they are not part of the Greenland “snow” microbiome.

These results show common prokaryotic genera detected by both Illumina and Nanopore sequencing (Fig. 2), as well as several eukaryotic lichen taxa (Supplemental Fig. 1). With the exception of sample 3, the core microbiome was shared between sequencing platforms. Illumina data identified Chlorogloea, Granulicella, Hymenobacter, Mesorhizobium, Methylobacterium, Nocardioides, Nostoc, Pseudomonas, Pseudonocardia, Roseomonas, Solirubrobacter, Sphingomonas, and Streptomyces, while Nanopore data revealed Belnapia, Chlorogloea, Granulicella, Hymenobacter, Mesorhizobium, Methylobacterium, Nocardioides, Nostoc, Pseudomonas, Pseudonocardia, Roseomonas, Solirubrobacter, Sphingomonas, and Streptomyces. Sample 3 showed only minor components of the core microbiome and was composed primarily of Neisseria, Rothia, and Streptococcus. All genera identified in the long read sequencing are also identified by short read sequencing, although the short read sequencing, which includes a greater depth of sequencing and a higher level of accuracy, identified more genera. The assessment of sample site similarity based on PCA using the Illumina whole genome sequencing results clearly indicates site differences based on genomic and species content (Fig. 3). Samples 10 and 10D contained moderate levels of debris and soil and deviated from other samples. The sample collected in a suburban site at the University of Vermont was significantly different from the “clean” snow samples. As also noted during sample collection, sample 12, collected near a rock ledge outcropping, contained minor levels of debris and grouped together with samples 7 and 9, which also contained greater levels of DNA.

Figure 3
Principal Component Analysis (PCA) is a model-free data reduction technique used to identify patterns in multidimensional data sets, and is here generated using genus-level classification and the trimmed mean of M-values (TMM). The distance between sample sites indicates similarity or difference based on the greatest sources of variation in the data, and distinguishes the “clean” snow samples from those with soil contamination (i.e., VT, 10 and 10D, and 12).

Culturing results

Snow samples were cultured on R2A agar at 4 °C for 30 days. Purified bacterial colonies were successfully recovered from sampling sites 1, 2, 3, 4, 5, 6, 7, 10, 10D, 12, and 15 and subsequently identified using Illumina and Nanopore whole genome shotgun sequencing. Data analysis was performed using the One Codex and National Center for Biotechnology Information BLAST Nucleotide databases (Table 4). Since samples 10 and 12 were collected near exposed ground and visually exhibited small amounts of debris, a greater diversity of culturable bacteria was observed. Sample 10D was intentionally collected at the snow-soil interface at site 10 and contained soil and plant debris to serve as a positive reference for higher microbial diversity. Sites 8, 9, 11, 13, and 14 yielded no growth on the R2A growth media. Staining, microscopic imaging, and colony morphology were recorded for all positive samples. Representative isolates are shown in Figure 4.

Table 4

Microbial characterization of purified isolates recovered from snow extracts and cultured on R2A media for 30 days at 4 °C. Colony enumerations are provided as estimates for cross sample comparison purposes. Microbial identification was accomplished using whole genome shotgun sequencing on an Illumina MiSeq instrument and analyzed using the NCBI BLAST database and One Codex software.

| Sample Site | Abundance (culture data) | Colony Description (culture data) | DNA Identification |
|---|---|---|---|
| 1 | Rare | Pink | Hymenobacter sp. PAMC 26554 |
| 2 | Rare | Fungi - Dematiaceous | Aureobasidium pullulans |
| | Rare | White | Stylodothis puccinioides |
| 3 | Abundant | Small Pink | Unknown Microbacteriaceae |
| | Rare | Small Amber | Sphingomonas sp. (OK281) |
| 4 | Moderate | Small White | Variovorax sp. |
| | Moderate | Small White | Robbsia andropogonis |
| 5 | Moderate | White Brown Center | Yeast |
| 6 | Rare | Tan | Yeast |
| 7 | Rare | Small Clear | Unknown |
| 10 | Moderate | Tan | Pseudomonas sp. K2I15 |
| | Low | White | Caballeronia sordidicola |
| | Rare | Yellow/Orange | Sphingomonas sp. |
| | Rare | Pink | Pseudomonas sp. |
| | Rare | Tan | Unknown |
| | Rare | Amber | Unknown Microbacteriaceae |
| 12 | Abundant | Yellow | Caballeronia sordidicola |
| | Abundant | Small White | Janthinobacterium agaricidamnosum |
| | Abundant | White | Duganella aceris |
| | Moderate | White | Burkholderia sp. Leaf177 |
| | Abundant | Yellow | Unknown Oxalobacteraceae |
| | Rare | Red | Sphingomonas sp. PAMC 26617 |
| | Abundant | Small Clear | Pseudomonas sp. PAMC 25886 |
| | Abundant | Large White | Mucilaginibacter polytrichastri |
| | Rare | Small Yellow | Pseudomonas fluorescens |
| | Abundant | Small White | Mucilaginibacter sp. PAMC 26640 |
| | Moderate | Clear | Mucilaginibacter sp. OK283 |
| 15 | Moderate | White Clear | Paenibacillus sp. IHBB 10380 |
| | Rare | White liquefaciens-like | Paenibacillus herberti |
| | Rare | Large tan | Paenibacillus herberti |
| | Rare | Small Translucent | Paenibacillus sp. IHBB 10380 |
| | Rare | White Convex | Paenibacillus pectinilyticus |


Figure 4
Microscopic images, colony morphologies, and identifications for 12 representative taxa recovered by culturing on R2A agar incubated for 30 days at 4 °C. Identifications were determined using whole genome sequencing. Sample numbers indicate sample site and are referenced in Table 4.

Negative control samples

Snow in the Ilulissat region is low biomass due to its remote nature, and most sequence reads from the negative controls derive from the “kitome” and other reagents used in the sampling process. Organisms were detected at low occurrence in these controls; representatives included Pseudomonas sp., Cutibacterium acnes, Rothia aeria, Cellulosimicrobium cellulans, Streptococcus mitis and salivarius, Acinetobacter junii, Lysobacter enzymogenes, Corynebacterium accolens and pseudogenitalium, Staphylococcus warneri, Tepidiphilus thermophilus, Malassezia restricta, and Streptococcus sanguinis. These organisms are often found in negative control microbiomes as part of the kitome.[2]

DISCUSSION

Technologies for the characterization of the microbial ecology of polar environments have made significant advances in recent years. These include state-of-the-art sampling instruments that concentrate large volumes of environmental liquids such as water, melted snow, or glacial run-off, as well as high-output massively parallel DNA sequencers able to identify microbes previously missed because of their resistance to culturing methods. These new instruments and techniques provide researchers with new genomic methods to characterize these ecosystems. In this study, snow was collected from undisturbed snow fields in Greenland, cultured using psychrophilic culture techniques, and shotgun sequenced using both Illumina and Nanopore sequencing to reveal a comprehensive microbiome and to provide metagenomics data. Since microorganisms are present at limited concentrations in snow, it is imperative to concentrate as much snow as possible into small tubes to be transported to a genomics laboratory. The InnovaPrep CP system provided an effective way to concentrate large amounts of melted snow (>2 L) in a field location and allowed for on-site sample preservation with ethanol. With an effective concentration of 4000-fold, DNA sequence data was obtained from both Nanopore and Illumina platforms to provide a comprehensive microbiome profile. Since the concentration factor is substantial, DNA-free techniques and reagents were employed, including disposable RNase/DNase-free tubes, flame sterilized implements, and DNA-free MetaPolyzyme to digest microbial cell walls. Because not all reagents were DNA-free, background-contaminating, trace-level DNA (kitome) was also concentrated, which required multiple negative controls to be processed simultaneously through the entire DNA sequencing procedure to account for nonbiologically relevant trace signal.

The results from this study revealed low concentrations of microorganisms and low yields of DNA from all samples except those exhibiting soil contamination. These site differences are also delineated in the PCA plot. Many of the clean snow samples yielded no cultured organisms even though a microbiome was detected by sequencing. This is expected since it is reported that only a small percentage of microorganisms can be cultured.[16] Of the cultured samples, many of the bacteria were pigmented psychrophiles, typical of organisms from this type of ecosystem. DNA sequences indicated a range of bacteria but also considerable levels of lichens. Ilulissat has an abundant population of lichens on terrestrial surfaces near the coastlines, and the ability of these taxa to move through the environment by wind is expected. Microbiome data from Nanopore and Illumina sequencing concurred, with higher diversity in the Illumina data because its sequencing depth was 50 to 100 times greater.

The culture and microbiome data show little agreement for the same sample, with DNA microbiome data revealing far greater microbial diversity at higher abundances. Organisms recovered by culturing were detected at <0.02% of the total reads for both Illumina and Nanopore, with the exception of Variovorax and Hymenobacter, which were observed at slightly higher levels (data not shown).

This can be in part due to previously described cultivation biases and incomplete lysis during DNA extraction[2],[17] or the recovery of extracellular environmental DNA (eDNA). Nonetheless, when studying these novel environments both techniques remain essential for the robust characterization of these understudied microbial communities.

There were several significant challenges for this study, and here, we highlight two. First, selecting an inappropriate analytical pipeline can result in misclassification. It is often necessary to use nucleotide BLAST on extreme microbiome samples, since many of the organisms are only represented by one gene and not contained within a genome database. Nonetheless, in our samples, over half of the DNA was not mapped to known organisms. Since the purpose of our research is to study the microbiome and not describe new species, those data are not included here. Many more samples (and sequencing) are needed to assemble new genomes for the required deposition into genomics databases for future microbiome studies. Second, since the negative control sample contained DNA from multiple species, it was necessary to remove these sequences. It is a significant challenge to develop a strategy for handling ultra-low biomass microbiome samples such as polar snow, air, spacecraft, or even cleanroom facilities.[18],[19] While methods such as whole genome amplification, for example, multiple displacement amplification, may be attractive, they also bias amplification and create additional challenges with interpretation.[20],[21] Future studies should consider collecting and concentrating as much snow as possible to minimize the need for any DNA amplification techniques. Despite the challenges, the microbiome of Arctic snow is compelling. Both culturing and sequencing data revealed unique microbiomes that require additional investigation into their community composition, metabolic capabilities, secondary metabolites, and biosynthetic gene cluster potential.

ACKNOWLEDGMENTS

The authors would like to thank our field guide, captain Edvard Samuelson of Ilulissat Boat and Tours, for working with us to find a relevant sampling site; Axel Brandt at Hotel Arctic for allowing us the use of the facilities as a temporary sample processing space; Dr. Nitin Singh of the NASA Jet Propulsion Lab and Dr. David Danko of Biota, Inc. for critical bioinformatics support; and Dr. Diana Krawczyk of the Greenland Institute of Natural Resources for assisting with experiment design and site locations. The authors would also like to acknowledge the permissions granted by the Government of Greenland Departementet for Erhverv, Energi og Forskning, Ministry of Industry, Energy and Research under permit G22-009 for the nonexclusive right for Utilization of Greenland Genetic Resources and the approved Nagoya permit. Lastly, the authors would like to thank their research partners at the Cold Regions Research and Engineering Laboratory for scientific support and funding.

Research reported in this publication was supported in part by an Institutional Development Award (IDeA) from the National Institute of General Medical Sciences of the National Institutes of Health under grant number P20GM103449. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of NIGMS or NIH.

SUPPLEMENTAL MATERIAL

Supplemental Table 1.

Amount of genomic DNA input for the Nextera XT library preps and amount of final library used to pool samples for sequencing.

| Pool group 1: Sample ID | Input amount (ng) | Pooled amount (ng) | Pool group 2: Sample ID | Input amount (ng) | Pooled amount (ng) | Pool group 3: Sample ID | Input amount (ng) | Pooled amount (ng) |
|---|---|---|---|---|---|---|---|---|
| 4-1 | 1.0 | 10.0 | 2-2 | 0.66 | 5.0 | 1-1 | 0.39 | 5.0 |
| 4-2 | 1.0 | 10.0 | 3-1 | 0.55 | 5.0 | 1-2 | 0.50 | 5.0 |
| 7-1 | 1.0 | 10.0 | 5-1 | 0.72 | 5.0 | 2-1 | 0.44 | 5.0 |
| 7-2 | 1.0 | 10.0 | 5-2 | 0.55 | 5.0 | 3-2 | 0.00 | 5.0 |
| 8-2 | 1.0 | 10.0 | 6-1 | 0.61 | 5.0 | 11-2 | 0.25 | 5.0 |
| 9-1 | 1.0 | 10.0 | 6-2 | 1.00 | 10.0 | 13-1 | 0.00 | 5.0 |
| 9-2 | 1.0 | 10.0 | 8-1 | 1.00 | 10.0 | 14-1 | 0.00 | 5.0 |
| 10-1 | 1.0 | 10.0 | 10-2 | 0.77 | 5.0 | 14-2 | 0.06 | 5.0 |
| 10D1 | 1.0 | 10.0 | 11-1 | 0.66 | 5.0 | N1 | 0.33 | 0.5 |
| 10D2 | 1.0 | 10.0 | 13-2 | 1.00 | 10.0 | N2 | 0.00 | 0.5 |
| 12-1 | 1.0 | 10.0 | A1 | 1.00 | 10.0 | PBS2 | 0.33 | 0.5 |
| 12-2 | 1.0 | 10.0 | A2 | 0.88 | 5.0 | PBS3 | 0.00 | 0.5 |


Supplemental Figure 1
Nonbacterial taxa highly represented at all sites consist mainly of lichens routinely found in the Arctic and are indicators of clean oligotrophic ecosystems.
