Student Wiki on methodology

This Wiki is intended to collectively take stock of the methodologies employed in the research papers we analyze during the course. "Writers" are students who wish to contribute to a specific subject. Before contributing, please add your name in the "Writers group choice". When you start a contribution, please indicate your name in brackets.


PLEASE: DO NOT change the INDEX page!
This page contains the links to the seven official subjects, which are the same as in the Choice.

To contribute, go to the right page by clicking on its description here in the index, then click EDIT and contribute. When you have finished, please save.

 



NGS


Modified: 18 March 2018, 2:17 PM   User: Cecilia Thairi


Videos are available in the very last section "Auxiliaries".

Subjects to be discussed:

1) When should short reads with high numbers of reads be used, and when longer reads with lower numbers?

2) What is paired-end sequencing?

3) NGS sequencing of very short RNAs (e.g. microRNAs, tRNAs, etc.)

4) "Exome" sequencing and "targeted" sequencing

5) How does Pacific Biosciences (PacBio) long-read sequencing work?

6) How does Nanopore sequencing work?

7) Other (your proposals)

Next-generation sequencing (NGS) is a massively parallel sequencing technology that has revolutionized the biological sciences. With its ultra-high throughput, scalability, and speed, NGS enables researchers to perform a wide variety of applications and to study biological systems at a level never before possible. The most important innovations of NGS are (1) the absence of DNA/RNA fragment cloning, (2) the use of micro- or nano-reactors immobilized on distinct solid supports, which enables a very high level of parallelization of in situ sequencing, and (3) the absence of electrophoretic separation of fragments, because nucleotides are identified as they are incorporated in the sequencing reaction.

For analysis, the DNA is fragmented by chemical or enzymatic methods into fragments of defined size. This yields a library of fragments that are covalently linked to adapters and subsequently used for clonal amplification and sequencing.
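As a toy illustration of the library preparation described above, the sketch below cuts a sequence into pieces of roughly defined size and attaches an adapter to each end. It is only a simplified model: the function names, the 10% size variation and the adapter sequences are assumptions made for the example, not real reagents or a real protocol.

```python
import random


def fragment(sequence, mean_size, rng):
    """Cut a sequence into consecutive pieces of roughly mean_size bases."""
    fragments = []
    start = 0
    while start < len(sequence):
        # Piece length varies around the mean (10% standard deviation, an assumed value).
        size = max(1, int(rng.gauss(mean_size, 0.1 * mean_size)))
        fragments.append(sequence[start:start + size])
        start += size
    return fragments


def ligate_adapters(fragments, adapter_5, adapter_3):
    """Attach one adapter to each end of every fragment."""
    return [adapter_5 + frag + adapter_3 for frag in fragments]


# Hypothetical input: a 1000-base random toy sequence and placeholder adapters.
rng = random.Random(0)
toy_genome = "".join(rng.choice("ACGT") for _ in range(1000))
library = ligate_adapters(fragment(toy_genome, 200, rng), "AAAAACCCCC", "GGGGGTTTTT")
print(len(library), "adapter-ligated fragments")
```

Every molecule in the resulting library carries the same known sequences at both ends, which is what allows a single primer pair to amplify and sequence all fragments in parallel.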

Read length depends on the NGS technology used: Illumina reads are around 100 to 150 bp, ABI SOLiD reads around 85 bp, 454 reads around 700 bp, etc. Short reads are mainly used for re-sequencing, where reads are mapped against a reference genome to explore genetic variation such as SNVs, indels and CNVs, whereas longer reads are used for de novo sequencing, which starts from primary data without a reference genome.
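The trade-off between read length and number of reads can be made concrete with the expected average depth of coverage, C = N × L / G (number of reads × read length / genome size). The sketch below is only illustrative: the genome size of 3 billion bases and the 30x target are assumed values chosen for the example, and the function names are ours.

```python
import math


def expected_coverage(num_reads, read_length, genome_size):
    """Average depth of coverage C = N * L / G."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Number of reads required to reach a target average coverage (rounded up)."""
    return math.ceil(target_coverage * genome_size / read_length)


if __name__ == "__main__":
    genome_size = 3_000_000_000  # assumed genome size for the example
    for read_length in (150, 700):  # read lengths mentioned in the text
        n = reads_needed(30, read_length, genome_size)
        print(f"{read_length} bp reads: {n} reads for 30x coverage")
```

For the same coverage, shorter reads require proportionally more reads, which is why short-read platforms are paired with very high read numbers.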

Paired-end sequencing allows users to sequence both ends of a fragment: after one end has been sequenced with a primer specific for one strand, the next step uses the opposite primer to sequence the complementary strand from the other end. It generates high-quality, alignable sequence data. Moreover, it facilitates the detection of genomic rearrangements (such as insertions, deletions and inversions) and repetitive sequence elements, as well as gene fusions and novel transcripts. Paired-end DNA sequencing reads provide superior alignment across DNA regions containing repetitive sequences, and produce longer contigs for de novo sequencing by filling gaps in the consensus sequence. The contigs can in turn be ordered into scaffolds on the basis of the connectivity information provided by the pairs of sequences that derive from the two ends of the same fragment (paired-end). In fact, if the two reads of a pair map to two different contigs, those contigs may be adjacent in the genome.
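The scaffolding idea can be sketched in a few lines: count the read pairs whose two mates map to different contigs, and treat well-supported pairs of contigs as candidate neighbours. This is a minimal sketch, not how any particular assembler works; the input format, the contig names and the `min_pairs` threshold are assumptions made for the example.

```python
from collections import Counter


def contig_links(pair_mappings, min_pairs=2):
    """Count read pairs whose two mates map to different contigs.

    pair_mappings: iterable of (contig_of_mate1, contig_of_mate2) tuples,
    with None for an unmapped mate.
    Returns {(contig_a, contig_b): n_pairs} for links supported by at
    least min_pairs read pairs (an assumed threshold).
    """
    links = Counter()
    for c1, c2 in pair_mappings:
        if c1 is None or c2 is None or c1 == c2:
            # Unmapped mate, or both mates on the same contig: no link.
            continue
        links[tuple(sorted((c1, c2)))] += 1
    return {pair: n for pair, n in links.items() if n >= min_pairs}


# Hypothetical mappings for five read pairs.
pairs = [
    ("contig1", "contig2"),
    ("contig2", "contig1"),
    ("contig1", "contig1"),
    ("contig2", "contig3"),
    ("contig3", None),
]
print(contig_links(pairs))
```

Here only the link between contig1 and contig2 is supported by two pairs; the single pair joining contig2 and contig3 falls below the threshold and is ignored.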

(Federica Galvagno)

  

NGS of short RNA

RNA sequencing (RNA-Seq) uses the capabilities of high-throughput sequencing methods to provide insight into the transcriptome of a cell. Beyond quantifying gene expression, the data generated by RNA-Seq facilitate the discovery of novel transcripts, identification of alternatively spliced genes, and detection of allele-specific expression.

MicroRNAs (miRNAs) are small RNA molecules of 17 to 24 nt that play an important role in the regulation of gene expression by modulating the translation and stability of mRNA.

With small RNA-Seq it is possible to discover novel miRNAs and other small noncoding RNAs, and to examine the differential expression of all small RNAs in any sample. It is possible to characterize variations such as isoforms of miRNAs (isomiRs) with single-base resolution, as well as to analyse any small RNA or miRNA without prior sequence or secondary structure information.

Generating miRNA sequencing libraries directly from total RNA helps to understand the role of noncoding RNA:

  • Understand how post-transcriptional regulation contributes to phenotype
  • Identify novel biomarkers
  • Capture the complete range of small RNA and miRNA species

A typical RNA-Seq experiment consists of isolating RNA, converting it to complementary DNA (cDNA), preparing the sequencing library, and sequencing it on an NGS platform.

Because small RNAs are of low abundance, short (15–30 nt), and lack polyadenylation, a separate strategy is often preferred to profile these RNA species. To extract the miRNA fragments from the total RNA, a size selection is performed: the total RNA is run on an agarose gel and the band corresponding to the size of miRNAs is cut out for further processing. This procedure excludes all larger fragments, including all mRNAs and rRNAs, from the samples. In the next step, the sequencing adapters are ligated to the size-selected RNA molecules, followed by reverse transcription to cDNA. The resulting cDNA library is then run on an agarose gel again.
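The gel-based size selection has a computational counterpart that is often applied to the reads after sequencing: remove the 3' adapter and keep only inserts whose length falls in the small RNA range. The sketch below assumes exact adapter matching and uses a placeholder adapter sequence; the function names and the example reads are hypothetical, and real tools also tolerate sequencing errors in the adapter.

```python
from typing import List, Optional


def trim_adapter(read: str, adapter: str) -> Optional[str]:
    """Return the insert preceding the adapter, or None if no adapter is found."""
    idx = read.find(adapter)
    if idx == -1:
        return None
    return read[:idx]


def size_select(reads: List[str], adapter: str,
                min_len: int = 15, max_len: int = 30) -> List[str]:
    """Keep adapter-trimmed inserts with min_len <= length <= max_len."""
    selected = []
    for read in reads:
        insert = trim_adapter(read, adapter)
        if insert is not None and min_len <= len(insert) <= max_len:
            selected.append(insert)
    return selected


# Placeholder adapter and made-up reads, for illustration only.
ADAPTER = "TGGAATTCTC"
reads = [
    "TGAGGTAGTAGGTTGTATAGTT" + ADAPTER + "GG",  # 22-nt insert: kept
    "ACGT" + ADAPTER,                            # 4-nt insert: too short
    "A" * 40,                                    # no adapter found: dropped
]
print(size_select(reads, ADAPTER))
```

The default limits of 15 and 30 nt follow the small RNA length range given in the text.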

The output of a next-generation miRNA sequencing experiment typically contains millions of short reads. All reads are aligned to the reference genome of the sequenced organism, and all reads whose first part perfectly matches the reference are kept as potential miRNA reads. The remaining reads are excluded from further analysis. After the abundance of each miRNA has been quantified in each sample, expression levels can be compared between samples. One can then identify miRNAs that are preferentially expressed at particular time points, or in particular tissues or disease states. After normalizing for the number of mapped reads in each sample, one can use a range of statistical tests to determine differential expression.
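The normalization step can be illustrated with reads per million (RPM), one simple way to correct for the number of mapped reads, followed by a log2 fold change between two samples. This is a sketch under stated assumptions: the miRNA names and counts are invented, the pseudocount of 1 is an arbitrary choice, and no statistical test is performed, which a real analysis with replicates would require.

```python
import math


def reads_per_million(counts):
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    return {mirna: c * 1_000_000 / total for mirna, c in counts.items()}


def log2_fold_change(rpm_a, rpm_b, pseudocount=1.0):
    """log2(B / A) per miRNA; the pseudocount avoids division by zero."""
    names = sorted(set(rpm_a) | set(rpm_b))
    return {
        m: math.log2((rpm_b.get(m, 0.0) + pseudocount)
                     / (rpm_a.get(m, 0.0) + pseudocount))
        for m in names
    }


# Invented counts for two samples with different sequencing depth.
sample_a = {"miR-x": 100, "miR-y": 900}
sample_b = {"miR-x": 5000, "miR-y": 5000}

rpm_a = reads_per_million(sample_a)
rpm_b = reads_per_million(sample_b)
for mirna, lfc in log2_fold_change(rpm_a, rpm_b).items():
    print(mirna, round(lfc, 2))
```

Although sample B has more raw reads for both miRNAs, after normalization only miR-x is higher in B (positive value), while miR-y is lower (negative value).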

Another approach is to identify a miRNA's mRNA targets, in order to understand which genes it regulates. Public databases provide predictions of miRNA targets. In addition, since microRNAs are important regulators of many cellular processes such as survival, proliferation, and differentiation, they are often involved in various aspects of cancer through the regulation of oncogene and tumor suppressor gene expression. In combination with the development of high-throughput profiling methods, miRNAs have been identified as biomarkers for cancer classification, response to therapy, and prognosis.


(Cecilia Thairi)