NGS


Modified: 18 March 2018, 4:06 PM. User: Daniele Garelli

Videos are available in the very last section "Auxiliaries".

Subjects to be discussed:

1) When to use short reads with high read numbers, and when to use longer reads with lower read numbers?

2) What is paired-end sequencing?

3) NGS of very short RNAs (e.g. microRNAs, tRNAs, etc.)

4) "Exome" sequencing and "targeted" sequencing

5) How does Pacific Biosciences long-read sequencing work?

6) How does Nanopore sequencing work?

7) Other (your proposals)

Next-generation sequencing (NGS) is a massively parallel sequencing technology that has revolutionized the biological sciences. With its ultra-high throughput, scalability, and speed, NGS enables researchers to perform a wide variety of applications and to study biological systems at a level never before possible. The most important innovations of NGS are (1) the absence of DNA/RNA fragment cloning, (2) the use of micro- or nano-reactors immobilized on distinct solid supports, which enables a very high level of parallelization of in situ sequencing, and (3) the absence of electrophoretic separation of fragments, because nucleotides are identified as soon as they are incorporated in the sequencing reaction.

Before analysis, the DNA is fragmented by chemical or enzymatic methods into fragments of defined size. The resulting fragment library is covalently ligated to adapters and then used for clonal amplification and sequencing.

Read length depends on the NGS technology used: Illumina reads reach around 300 bases, ABI SOLiD reads around 75, 454 reads around 1000, and so on. Short reads are mainly used for resequencing, where they are mapped against a reference genome to explore genetic variation such as SNVs, indels, and CNVs, whereas longer reads are used for de novo sequencing, which starts from primary data without a reference genome.
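The trade-off between read length and number of reads can be made concrete with the usual mean-coverage estimate, coverage = read length × number of reads / genome size. The sketch below is only an illustration: the function names are hypothetical, and the 3,000,000,000-base genome and 30x target are round figures assumed for the example, not values taken from this page.

```python
import math


def expected_coverage(read_length: int, num_reads: int, genome_size: int) -> float:
    """Mean depth of coverage: read_length * num_reads / genome_size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return read_length * num_reads / genome_size


def reads_needed(read_length: int, target_coverage: float, genome_size: int) -> int:
    """Number of reads required to reach a target mean coverage, rounded up."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return math.ceil(target_coverage * genome_size / read_length)


if __name__ == "__main__":
    genome = 3_000_000_000  # assumed round figure for a large genome
    # Short reads need many more reads than long reads for the same depth.
    print(reads_needed(150, 30, genome))     # short reads (assumed 150 bases)
    print(reads_needed(10_000, 30, genome))  # long reads (assumed 10,000 bases)
```

For the same target depth, the number of reads required scales inversely with read length, which is why short-read runs rely on very high read counts.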

Paired-end sequencing allows users to sequence both ends of a fragment: after one strand has been sequenced from a strand-specific primer, the next step uses the opposite primer to sequence the complementary strand from the other end. It generates high-quality, alignable sequence data. Moreover, it facilitates detection of genomic rearrangements (such as insertions, deletions, and inversions) and repetitive sequence elements, as well as gene fusions and novel transcripts. Paired-end reads align better across DNA regions containing repetitive sequences and produce longer contigs for de novo sequencing by filling gaps in the consensus sequence. The contigs can in turn be ordered into scaffolds using the connectivity information provided by pairs of sequences that derive from the two ends of the same fragment. If the two reads of a pair map to two different contigs, those contigs are likely to be adjacent in the genome.
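The scaffolding idea described above (two mates of a pair landing on different contigs suggests that the contigs are neighbours) can be sketched as a simple link count. This is a minimal illustration, not a real scaffolder: the function name, the input format (one tuple of contig names per read pair, with None for an unmapped mate), and the min_pairs threshold are all assumptions made for the example, and orientation and insert size are ignored.

```python
from collections import Counter


def contig_links(pair_mappings, min_pairs=2):
    """Count read pairs whose two mates map to different contigs.

    pair_mappings: iterable of (contig_of_mate1, contig_of_mate2) tuples;
                   None marks an unmapped mate.
    Returns a dict {(contig_a, contig_b): n_pairs}, keeping only links
    supported by at least min_pairs pairs. Names are sorted within each key
    so that (A, B) and (B, A) count as the same link.
    """
    counts = Counter()
    for c1, c2 in pair_mappings:
        if c1 is None or c2 is None or c1 == c2:
            continue  # unmapped mate, or both mates on the same contig
        counts[tuple(sorted((c1, c2)))] += 1
    return {link: n for link, n in counts.items() if n >= min_pairs}


if __name__ == "__main__":
    pairs = [
        ("contigA", "contigB"),
        ("contigB", "contigA"),
        ("contigA", "contigA"),
        ("contigB", "contigC"),
        (None, "contigC"),
    ]
    print(contig_links(pairs))
```

Requiring more than one supporting pair is a common-sense guard against chance mismappings; the threshold of 2 is arbitrary here.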

(Federica Galvagno)

NGS of short RNA

RNA sequencing (RNA-Seq) uses the capabilities of high-throughput sequencing methods to provide insight into the transcriptome of a cell. Beyond quantifying gene expression, the data generated by RNA-Seq facilitate the discovery of novel transcripts, identification of alternatively spliced genes, and detection of allele-specific expression.

MicroRNAs (miRNAs) are small RNA molecules of 17 to 24 nt that play an important role in the regulation of gene expression by modulating the translation and stability of mRNA.

With small RNA-Seq it is possible to discover novel miRNAs and other small noncoding RNAs, and examine the differential expression of all small RNAs in any sample. It is possible to characterize variations such as isoforms of miRNAs (isomirs) with single-base resolution, as well as analyse any small RNA or miRNA without prior sequence or secondary structure information.

Generating miRNA sequencing libraries directly from total RNA helps to clarify the role of noncoding RNA. It makes it possible to:

Understand how post-transcriptional regulation contributes to phenotype
Identify novel biomarkers
Capture the complete range of small RNA and miRNA species

A typical RNA-Seq experiment consists of isolating RNA, converting it to complementary DNA (cDNA), preparing the sequencing library, and sequencing it on an NGS platform.

Because small RNAs are low in abundance, short (15–30 nt), and lack polyadenylation, a separate strategy is often preferred to profile these RNA species. To extract the miRNA fragments from the total RNA, a size selection is performed: the total RNA is run on a gel (agarose, or more commonly polyacrylamide for fragments this short) and the band corresponding to the size of miRNAs is cut out for further processing. This procedure excludes all larger fragments, including mRNAs and rRNAs, from the sample. In the next step, the sequencing adapters are ligated to the size-selected RNA molecules, followed by reverse transcription to cDNA. The resulting cDNA library is then run on a gel again.
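The wet-lab size selection has a computational counterpart: because the inserts are shorter than the read, each read usually runs into the 3' adapter, which is trimmed before reads are filtered by length. The sketch below illustrates that idea only. The adapter sequence is a placeholder assumed for the example (use the one from the actual library kit), the function names are hypothetical, and real trimming tools also tolerate mismatches and partial adapters, which this does not.

```python
# Placeholder 3' adapter, assumed for illustration only.
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"


def trim_adapter(read, adapter=ADAPTER, min_match=8):
    """Cut the read at the first exact occurrence of the adapter's first
    min_match bases; return the read unchanged if no adapter is found."""
    idx = read.find(adapter[:min_match])
    return read if idx == -1 else read[:idx]


def size_select(reads, min_len=15, max_len=30, adapter=ADAPTER):
    """Trim the adapter from each read and keep inserts of min_len..max_len nt."""
    kept = []
    for read in reads:
        insert = trim_adapter(read, adapter)
        if min_len <= len(insert) <= max_len:
            kept.append(insert)
    return kept


if __name__ == "__main__":
    small_rna = "TGAGGTAGTAGGTTGTATAGTT"  # example 22-nt insert
    reads = [
        small_rna + ADAPTER,   # kept: 22 nt after trimming
        "ACGT" + ADAPTER,      # dropped: too short
        "A" * 40 + ADAPTER,    # dropped: too long
    ]
    print(size_select(reads))
```

The 15 to 30 nt window mirrors the size range given in the text; the exact bounds are a choice for the analyst.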

The output of a next-generation miRNA sequencing experiment typically contains millions of short reads. All reads are aligned to the reference genome of the sequenced organism, and reads whose first part perfectly matches the reference are kept as potential miRNA reads; the remaining reads are discarded from further analysis. Once miRNA abundances have been quantified for each sample, expression levels can be compared between samples. This makes it possible to identify miRNAs that are preferentially expressed at particular time points, or in particular tissues or disease states. After normalizing for the number of mapped reads in each sample, a range of statistical tests can be used to determine differential expression.
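The normalization step can be illustrated by scaling raw counts to counts per million mapped reads and comparing two samples with a log2 fold change. This is a minimal sketch under stated assumptions: the miRNA names and counts are invented, the function names are hypothetical, and a fold change is only a descriptive measure, not one of the statistical tests mentioned above.

```python
import math


def counts_per_million(counts):
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return {name: c * 1_000_000 / total for name, c in counts.items()}


def log2_fold_change(cpm_a, cpm_b, pseudocount=1.0):
    """log2(sample B / sample A) per miRNA.

    The pseudocount avoids division by zero for miRNAs absent in one sample.
    """
    names = set(cpm_a) | set(cpm_b)
    return {
        n: math.log2((cpm_b.get(n, 0.0) + pseudocount) / (cpm_a.get(n, 0.0) + pseudocount))
        for n in names
    }


if __name__ == "__main__":
    # Invented counts for two hypothetical samples with different depths.
    sample_a = {"miR-X": 100, "miR-Y": 900}
    sample_b = {"miR-X": 600, "miR-Y": 1400}
    cpm_a = counts_per_million(sample_a)
    cpm_b = counts_per_million(sample_b)
    for name, lfc in sorted(log2_fold_change(cpm_a, cpm_b).items()):
        print(name, round(lfc, 2))
```

Without the scaling step, sample B would appear to have more of both miRNAs simply because it was sequenced more deeply.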

Another approach is to identify a miRNA's mRNA targets, in order to understand which genes it regulates. Public databases provide predictions of miRNA targets. Because microRNAs are important regulators of many cellular processes such as survival, proliferation, and differentiation, they are often involved in various aspects of cancer through the regulation of oncogene and tumor suppressor gene expression. With the development of high-throughput profiling methods, miRNAs have been identified as biomarkers for cancer classification, response to therapy, and prognosis.

(Cecilia Thairi)

Nanopore sequencing

Nanopore systems offer real-time, scalable, direct DNA sequencing: the user chooses the fragment length and the nanopore sequences entire fragments. With nanopore sequencing, a single molecule of DNA or RNA can be sequenced without PCR amplification or chemical labeling of the sample. The technology uses electrophoresis to drive an unknown sample through an orifice about 10^-9 meters (1 nanometer) in diameter, called a nanopore. The nanopore is embedded in a membrane and the system always contains an electrolytic solution, so when a constant electric field is applied an electric current can be observed. The magnitude of the current density across the nanopore surface depends on the nanopore's dimensions and on the composition of the DNA or RNA occupying it. Samples cause characteristic changes in current density, so the nucleotides that compose the sample can be recognized by examining these changes: each nucleotide has its own profile of current alteration.

In essence, the nanopore is set in an electrically resistant membrane. When an analyte passes through the pore or near its aperture, it creates a characteristic disruption in the current. Measuring that current makes it possible to identify the molecule in question.
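The principle of reading bases from current disruptions can be sketched as a toy decoder that averages the current over each base's dwell window and picks the nucleotide with the nearest reference level. Everything numeric here is invented for illustration: the per-base current levels, the trace, and the fixed number of samples per base are assumptions, and the one-level-per-nucleotide model is a deliberate simplification of how real signals behave.

```python
# Invented reference current levels (arbitrary units), one per nucleotide.
LEVELS = {"A": 50.0, "C": 40.0, "G": 60.0, "T": 30.0}


def call_base(current, levels=LEVELS):
    """Return the nucleotide whose reference level is closest to the current."""
    return min(levels, key=lambda base: abs(levels[base] - current))


def decode(trace, samples_per_base, levels=LEVELS):
    """Split the trace into fixed-length windows, average each window,
    and call the nearest base. Trailing samples that do not fill a
    whole window are ignored."""
    if samples_per_base <= 0:
        raise ValueError("samples_per_base must be positive")
    bases = []
    for start in range(0, len(trace) - samples_per_base + 1, samples_per_base):
        window = trace[start:start + samples_per_base]
        bases.append(call_base(sum(window) / len(window), levels))
    return "".join(bases)


if __name__ == "__main__":
    # Noisy invented trace: three samples per base.
    trace = [59.2, 61.0, 60.3, 49.1, 50.8, 50.2,
             30.5, 29.4, 31.0, 41.2, 39.0, 40.1]
    print(decode(trace, samples_per_base=3))
```

Averaging within a window is what makes the call robust to sample-level noise; a fixed dwell time per base is the least realistic assumption in this sketch.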

Pores can be formed by proteins that puncture membranes (biological nanopores) or made in solid materials (solid-state nanopores). Biological nanopore sequencing relies on transmembrane proteins, called porins, embedded in lipid membranes. Alpha hemolysin (αHL), a bacterial pore-forming protein that causes lysis of red blood cells, can distinguish specific bases moving through the pore; its effectiveness can be improved by coupling an exonuclease to the αHL pore. The enzyme would periodically cleave single bases, enabling the pore to identify successive bases. Mycobacterium smegmatis porin A (MspA) is the second biological nanopore currently being investigated for DNA sequencing. The MspA pore has been identified as a potential improvement over αHL because of its more favorable structure, and the natural pore has been modified to improve translocation by replacing three negatively charged aspartic acids with neutral asparagines.

Solid-state nanopore sequencing approaches do not incorporate proteins into their systems. Instead, they use various metal or metal alloy substrates with nanometer-sized pores that allow DNA or RNA to pass through. One proposed improvement is to measure electron tunneling through the bases as ssDNA translocates through the nanopore. Most research so far has focused on proving that bases can be determined using electron tunneling; these studies used a scanning probe microscope as the sensing electrode and showed that bases can be identified by their specific tunneling currents.

(Daniele Garelli)

Advanced Molecular Biology 2017-2018, Student Wiki on methodology