Epigenomics: ChIP-Seq, DNase-Seq, FAIRE, ATAC-Seq, Nucleosome positioning

Table of contents

1.1. ChIP-Seq

1.2. DNase-Seq

1.3. ATAC-Seq

1.4. FAIRE-Seq

2. Nucleosome Positioning

ChIP-Seq

(Cecilia Boretto)

ChIP-Seq identifies the binding sites of DNA-associated proteins and can be used to map global binding sites for a given protein.

ChIP-seq protocol:

the first stage is to stabilize the link between proteins and DNA (DNA-protein crosslinking) with formaldehyde;
the cells are then washed with PBS and collected;
the second stage is to fragment the DNA (with the bound proteins) by sonication in lysis buffer (sonication is performed in alternating sonication and pause cycles, usually 12, to avoid the formation of foam that could escape from the Eppendorf tube and cause sample loss);
the third stage is the addition of an antibody specific for the protein of interest; the antibody is bound to beads (sepharose or magnetic), which allow the antibody-protein-DNA complexes to be pulled down (sepharose beads are pelleted at the bottom of the tube by centrifugation, magnetic beads are captured with a magnet);
the complexes are then collected and washed to remove non-specifically bound proteins;
the last stage is the reversal of the DNA-protein crosslinks and the digestion of the proteins with proteinase K;
the extracted DNA fragments can then be sequenced with NGS;
after sequencing, the reads are aligned to the genome;
after alignment, peaks (regions enriched in reads) are identified.
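
The last two steps (alignment, then peak identification) can be illustrated with a minimal Python sketch. It is only a toy: the read intervals, the chromosome length and the `min_depth` threshold are made-up assumptions, and real peak callers compare the ChIP sample against a control with a statistical model rather than using a fixed depth cutoff.

```python
def coverage(reads, chrom_length):
    """Per-base read depth from aligned fragments given as half-open (start, end) intervals."""
    diff = [0] * (chrom_length + 1)
    for start, end in reads:
        diff[start] += 1   # depth rises where a fragment starts
        diff[end] -= 1     # and falls just after it ends
    depth, running = [], 0
    for i in range(chrom_length):
        running += diff[i]
        depth.append(running)
    return depth


def call_peaks(depth, min_depth):
    """Return (start, end, summit) for every run of positions with depth >= min_depth."""
    peaks, start = [], None
    for pos, d in enumerate(depth + [0]):  # trailing 0 closes a run that reaches the end
        if d >= min_depth and start is None:
            start = pos
        elif d < min_depth and start is not None:
            window = depth[start:pos]
            summit = start + window.index(max(window))  # first position of maximal depth
            peaks.append((start, pos, summit))
            start = None
    return peaks


# Hypothetical aligned fragments on a 60 bp toy "chromosome"
reads = [(10, 30), (12, 32), (15, 35), (14, 34), (40, 50)]
depth = coverage(reads, 60)
peaks = call_peaks(depth, min_depth=3)
print(peaks)
```

Here the four overlapping fragments pile up into one peak, while the isolated fragment at 40-50 stays below the threshold and is ignored.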

Advantage:

ChIP-Seq does not require prior knowledge of the DNA sequences bound by the protein
ChIP-Seq delivers genome-wide profiling with massively parallel sequencing, generating millions of counts across multiple samples for cost-effective, precise, unbiased investigation of epigenetic patterns
Captures DNA targets for transcription factors or histone modifications across the entire genome of any organism
Defines transcription factor binding sites
Reveals gene regulatory networks in combination with RNA sequencing and methylation analysis
Offers compatibility with various input DNA samples

Disadvantage:

Large-scale ChIP assays are challenging in intact model organisms, because antibodies have to be generated for each TF or, alternatively, transgenic model organisms expressing epitope-tagged TFs need to be produced
Researchers studying differential gene expression patterns in small organisms also face problems, because the relevant genes may be expressed at low levels, in a small number of cells, or in a narrow time window
ChIP experiments cannot discriminate between different TF isoforms (protein isoforms)

More information can be found here:

DNase-Seq

(Emilia Petrachi)

DNase-Seq is one of the several approaches in molecular biology useful to identify DNA response elements, or regulatory regions in general, through genome-wide sequencing of regions sensitive to cleavage by DNase I.
A brief outline of the technique is the following:

DNA-protein complexes are treated with DNase I;
DNA extraction and sequencing are performed;
Sequences bound by regulatory proteins are protected from DNase I digestion;
Deep sequencing is performed to provide accurate representation of location of regulatory proteins in the genome.
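
The idea that protein-bound sequences are protected from cleavage can be sketched in Python as a toy "footprint" search: inside a region that is otherwise cut frequently, a stretch with no cuts suggests a bound protein. The cut positions and the `min_gap` value below are hypothetical, and real footprinting tools also have to correct for the sequence bias of DNase I.

```python
def footprints(cut_positions, min_gap):
    """Return half-open (start, end) intervals of at least min_gap consecutive uncut bases
    lying between two observed DNase I cut positions."""
    cuts = sorted(set(cut_positions))
    found = []
    for left, right in zip(cuts, cuts[1:]):
        uncut = right - left - 1          # number of bases strictly between two cuts
        if uncut >= min_gap:
            found.append((left + 1, right))
    return found


# Hypothetical cut positions inside one hypersensitive region
cuts = [100, 102, 103, 105, 120, 121, 123, 124]
protected = footprints(cuts, min_gap=10)
print(protected)
```

In this example the only protected stretch is the 14-base gap between the cuts at 105 and 120.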

Pros

Can detect open chromatin
No prior knowledge of the sequence or binding protein is required
Compared to formaldehyde-assisted isolation of regulatory elements and sequencing (FAIRE-seq), has greater sensitivity at promoters

Cons

DNase I has a sequence-specific cleavage bias, and hypersensitive sites might not account for the entire genome
DNA loss through the multiple purification steps limits sensitivity
Integration of DNase I with ChIP data is necessary to identify and differentiate similar protein-binding sites

More information can be found at this website:

https://emea.illumina.com/science/sequencing-method-explorer/kits-and-arrays/dnase-seq-dnasel-seq.html

This video, made by a Biology Professor at Davidson College, explains the protocol in really easy terms:

Brief outline of the DNase-Seq protocol

Another outline of the protocol

-----

ATAC-Seq

(Marianna Saviozzi)

Assay for Transposase-Accessible Chromatin with high-throughput sequencing (ATAC-Seq) is a method for mapping CHROMATIN ACCESSIBILITY genome-wide. It makes use of a hyperactive version of the bacterial Tn5 transposase pre-loaded with sequencing adapters, which are inserted into accessible regions of chromatin. In physiological conditions Tn5 transfers a DNA fragment from one genomic location to another; in this application it is pre-loaded with 2 sequencing adapters, so their insertion into the accessible chromatin regions fragments the genome and tags the fragments at the same time (tagmentation). These fragments are then PCR-amplified and sequenced using NGS technologies. The sequencing peaks correspond to open chromatin, since sequencing starts from the accessible sites where Tn5 has inserted the adapters.
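
Because each read starts at a Tn5 insertion, an accessibility signal can be sketched in Python by counting read 5' ends per genomic window. The read positions and the window size below are made up. The default shifts (+4 bp for plus-strand reads, -5 bp for minus-strand reads) are a commonly used convention to centre the signal on the Tn5 binding site; they are an assumption added here, not something taken from the text above.

```python
from collections import Counter


def insertion_site(five_prime_pos, strand, plus_shift=4, minus_shift=-5):
    """Shift a read's 5' end to the assumed centre of the Tn5 insertion event."""
    if strand == "+":
        return five_prime_pos + plus_shift
    if strand == "-":
        return five_prime_pos + minus_shift
    raise ValueError(f"unknown strand: {strand!r}")


def accessibility_signal(read_ends, window):
    """Count shifted insertion sites per fixed-size window (keyed by window index)."""
    counts = Counter()
    for pos, strand in read_ends:
        counts[insertion_site(pos, strand) // window] += 1
    return dict(counts)


# Hypothetical (5' position, strand) pairs for aligned reads
read_ends = [(100, "+"), (130, "-"), (141, "+"), (160, "-"), (905, "+")]
signal = accessibility_signal(read_ends, window=50)
print(signal)
```

Windows that collect many insertions are candidates for open chromatin; windows with few or none are likely to be inaccessible.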

Pros:

Fast, simple and sensitive approach (preparation can be completed in 3 hours)
Works with many cell types and species
Requires no sonication, phenol-chloroform extraction (FAIRE), or antibodies (ChIP-Seq)
Modifications have been made to the protocol in order to perform single-cell analysis.

Cons:

The number of cells must be optimized from the beginning: too few cells lead to over-transposition, while too many lead to under-transposition (for studies on human cells 500-50,000 cells are recommended, but the optimal number may vary according to cell type and species)

Applications:

Nucleosome mapping: identification of changes in nucleosome position during differentiation or between experimental conditions, and correlation with sequence context.
Transcription factor occupancy analysis: information complementary to FAIRE and DNase-Seq outputs.
Identification of novel enhancers during development.
In-depth study of the genomic profile associated with pathological conditions such as cancer.

FAIRE-Seq

(Michela Anfossi)

FAIRE stands for Formaldehyde-Assisted Isolation of Regulatory Elements.
Chromatin is crosslinked with formaldehyde in vivo, sheared by sonication, and phenol-chloroform extracted. The DNA recovered in the aqueous phase is fluorescently labeled and hybridized to a DNA microarray. FAIRE performed in human cells strongly enriches DNA coincident with the location of DNase I hypersensitive sites, transcriptional start sites, and active promoters. FAIRE enrichment also shows cell-type-specific patterns.
FAIRE has utility as a positive selection for genomic regions associated with regulatory activity, including regions traditionally detected by nuclease hypersensitivity assays.

The assay extracts the non-crosslinked DNA, so only these nucleosome-depleted regions are purified, enriched and sequenced. In FAIRE-Seq, the extracted DNA fragments are analyzed in a high-throughput way using next-generation sequencing.
Enrichment by FAIRE has a very strong negative correlation with nucleosome occupancy (Hogan et al. 2006), as measured by comparison with nucleosome ChIP–chip experiments.

Pros:

Simple and highly reproducible protocol
Does not require antibodies
Does not require enzymes, such as DNase or MNase, avoiding the optimization and extra steps necessary for enzymatic processing
Does not require a single-cell suspension or nuclear isolation, so it is easily adapted for use on tissue samples

Cons:

Cannot identify regulatory proteins bound to DNA
DNase-Seq may be better at identifying nucleosome-depleted promoters of highly expressed genes

image taken from: https://www.illumina.com/science/sequencing-method-explorer/kits-and-arrays/faire-seq-sono-seq.html

Video on FAIRE-seq:
https://www.jove.com/video/57272/formaldehyde-assisted-isolation-regulatory-elements-to-measure

Nucleosome Positioning

(Francesca Castaldo)

Nucleosomes are composed of a segment of DNA about 147 bp long wrapped around a histone octamer, and they serve as the basic unit of chromatin. Nucleosome positioning refers to the position of the DNA double helix relative to the histone octamer, and it has a fundamental role in:

- Transcription

- DNA replication

- Other DNA transactions (packaging of DNA into nucleosomes occludes the binding sites for several proteins)

Nucleosomes are usually depleted around transcription start sites (TSSs), resulting in a Nucleosome Free Region (NFR) flanked by H2A.Z-containing nucleosomes. The first nucleosome downstream of the TSS (the +1 nucleosome) is strongly localized, and nucleosomes at the 5′ end of a gene are generally better localized than nucleosomes in the middle of genes.

CIS DETERMINANTS OF NUCLEOSOME POSITIONING

The histone octamer shows little sequence preference in the sense of a classical binding motif. However, the DNA has to wrap almost twice around the octamer, and this is a constraint: the energy required to bend a given genomic sequence influences the binding affinity of the histone octamer. Since structural properties of DNA, such as local bendability, depend on DNA sequence, DNA sequence is expected to contribute at least partially to nucleosome positioning. The structure of poly-dA/dT sequences differs from the canonical double helix and is presumed to resist the distortions necessary for wrapping around the histone octamer.
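
As a small illustration of a sequence-based (cis) signal, the Python sketch below scans a DNA sequence for poly-dA/dT tracts, which are candidate nucleosome-disfavouring stretches. The example sequence is invented and the minimum tract length of 5 is an arbitrary assumption; real positioning models use much richer sequence features.

```python
import re


def poly_da_dt_tracts(sequence, min_length=5):
    """Return (start, end, tract) for runs of A or of T at least min_length long.
    A run of T on this strand is a run of A on the complementary strand."""
    pattern = re.compile(r"A{%d,}|T{%d,}" % (min_length, min_length))
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(sequence.upper())]


# Invented sequence: 7 A's, then 5 T's, then a run of 3 A's that is too short to count
sequence = "GCGC" + "A" * 7 + "GCG" + "T" * 5 + "CG" + "AAA" + "GC"
tracts = poly_da_dt_tracts(sequence)
print(tracts)
```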

TRANS DETERMINANTS OF NUCLEOSOME POSITIONING

The modest success of sequence-driven nucleosome positioning algorithms suggests the involvement of trans factors in the positioning of a significant fraction of nucleosomes. Broadly, three major classes of trans factors have been implicated in nucleosome positioning/occupancy:

1. Transcription factors, especially the abundant multifunctional regulators known as “general regulatory factors” (GRFs)

2. Chromatin remodelers: they have an ATPase domain and use the energy of ATP hydrolysis to slide nucleosomes laterally or eject them from DNA, among other activities

3. RNA polymerase

NUCLEOSOME POSITIONING AND GENE REGULATION

Nucleosomes have classically been thought to prevent transcription factor binding to motifs located within the nucleosome, and for most transcription factors this appears to be true. In vivo, most functional transcription factor binding motifs are in nucleosome-free regions.

To get more info you can check these articles:

- https://doi.org/10.1016/j.ydbio.2009.06.012

- https://doi.org/10.5772/intechopen.70935

(ALESSIA RUBIOLA)

Here is a video that explains how MNase-seq is used to study nucleosome positioning:

MNase is micrococcal nuclease. It induces single-strand breaks and subsequently double-strand breaks close to the first ones. MNase continues to digest the exposed DNA until it reaches an obstruction, the nucleosome. It generates fragments of about 147 bp, which are the DNA sequences protected by the nucleosome. After sequencing, the reads can be aligned to the genome to determine which DNA regions are bound by nucleosomes.
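
Once the fragments are aligned, a simple way to estimate nucleosome positions is to keep the fragments whose length is close to 147 bp and take their midpoints as nucleosome centres (dyads). The Python sketch below assumes paired-end fragments given as half-open (start, end) intervals; the coordinates and the 20 bp length tolerance are hypothetical choices for illustration.

```python
NUCLEOSOME_BP = 147  # approximate length of DNA protected by one nucleosome


def dyad_positions(fragments, tolerance=20):
    """Return the midpoint of every fragment whose length is within
    `tolerance` bp of a mononucleosome-sized fragment."""
    dyads = []
    for start, end in fragments:
        length = end - start
        if abs(length - NUCLEOSOME_BP) <= tolerance:
            dyads.append(start + length // 2)
    return dyads


# Hypothetical aligned fragments; the 50 bp fragment is too short and is discarded
fragments = [(1000, 1147), (1002, 1149), (1003, 1150), (1200, 1250), (1400, 1547)]
dyads = dyad_positions(fragments)
print(dyads)
```

Midpoints that cluster tightly (here around position 1075) point to a well-positioned nucleosome, while scattered midpoints suggest a fuzzy one.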

Advanced Molecular Biology 2019-2020
Student Wiki on methodology