Insulated neighborhood

In mammalian biology, insulated neighborhoods are chromosomal loop structures formed by the physical interaction of two DNA loci bound by the transcription factor CTCF and co-occupied by cohesin.[1] Insulated neighborhoods are thought to be structural and functional units of gene control because their integrity is important for normal gene regulation. Current evidence suggests that these structures form the mechanistic underpinnings of higher-order chromosome structures, including topologically associating domains (TADs). Insulated neighborhoods are therefore important for understanding both gene regulation in normal cells and dysregulated gene expression in disease.

Multiple levels of mammalian genome organization. Chromosomes occupy discrete territories in the nucleus (left). Topologically associating domains (TADs) are regions of the genome with locally high interaction frequency (center). Insulated neighborhoods are loops, formed by the interaction of CTCF/cohesin-bound anchors, that contain genes and their regulatory elements (right).

Enhancer-gene targeting

Mammalian gene transcription is generally controlled by enhancers.[2][3][4][5][6] Enhancers can regulate transcription of genes at large distances by looping to physically contact their target genes. This property of enhancers makes it difficult to identify an enhancer's target gene(s). Insulators, another type of DNA regulatory element, limit an enhancer's ability to target distal genes when the insulator is located between an enhancer and a potential target.[7][8][9][10] In mammals, insulators are bound by CTCF,[11] but only a minority of CTCF-bound sites function as insulators.[12] CTCF molecules can form homodimers on DNA, which can be co-bound by cohesin; this chromatin loop structure helps constrain the ability of enhancers within the loop to target genes outside the loop. Loops with CTCF and cohesin at the start and end of the loop that restrict enhancer-gene targeting are "insulated neighborhoods."

Function

Insulated neighborhoods are defined as chromosome loops that are formed by CTCF homodimers, are co-bound by cohesin, and contain at least one gene.[13][14] The CTCF/cohesin-bound regions delimiting an insulated neighborhood are called "anchors." One study in human embryonic stem cells identified ~13,000 insulated neighborhoods that, on average, each contained three genes and spanned about 90 kb.[15] Two lines of evidence argue that the boundaries of insulated neighborhoods are insulating: 1) the vast majority (~90-97%) of enhancer-gene interactions are contained within insulated neighborhoods, and 2) genetic perturbation of CTCF/cohesin-bound insulated neighborhood anchors leads to local gene dysregulation due to novel interactions with elements outside the neighborhood.
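The definition and the first line of evidence above can be illustrated with a minimal Python sketch. It assumes that CTCF/cohesin-anchored loops, genes, and enhancers are already available as genomic intervals; the `Interval` type, the function names, and all coordinates are hypothetical toy values, not data from the cited studies.

```python
from typing import List, NamedTuple, Tuple


class Interval(NamedTuple):
    """A genomic interval; for a loop, start and end are the two anchors."""
    chrom: str
    start: int
    end: int


def contains(outer: Interval, inner: Interval) -> bool:
    """True if `inner` lies entirely within `outer` on the same chromosome."""
    return (outer.chrom == inner.chrom
            and outer.start <= inner.start
            and inner.end <= outer.end)


def insulated_neighborhoods(loops: List[Interval],
                            genes: List[Interval]) -> List[Interval]:
    """Keep only CTCF/cohesin-anchored loops that contain at least one gene."""
    return [loop for loop in loops if any(contains(loop, g) for g in genes)]


def fraction_contained(interactions: List[Tuple[Interval, Interval]],
                       neighborhoods: List[Interval]) -> float:
    """Fraction of (enhancer, gene) pairs with both partners in one neighborhood."""
    if not interactions:
        return 0.0
    inside = sum(
        1 for enhancer, gene in interactions
        if any(contains(n, enhancer) and contains(n, gene) for n in neighborhoods)
    )
    return inside / len(interactions)


# Hypothetical toy data (coordinates are made up for illustration).
loops = [Interval("chr1", 1_000, 91_000), Interval("chr1", 200_000, 250_000)]
genes = [Interval("chr1", 5_000, 20_000), Interval("chr1", 120_000, 130_000)]
neighborhoods = insulated_neighborhoods(loops, genes)

interactions = [
    (Interval("chr1", 30_000, 30_500), genes[0]),  # enhancer inside the loop
    (Interval("chr1", 95_000, 95_500), genes[0]),  # enhancer outside the loop
]
contained = fraction_contained(interactions, neighborhoods)
```

In this toy example only the first loop qualifies as an insulated neighborhood, because the second loop contains no gene, and one of the two enhancer-gene interactions falls within it. Real analyses would derive loops and interactions from chromatin interaction data rather than hand-written intervals.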

The majority of insulated neighborhoods appear to be maintained during development because CTCF binding and CTCF-CTCF loop structures are very similar across human cell types.[16][17] While the locations of many insulated neighborhood structures are maintained across different cell types, the enhancer-gene interactions occurring within them are cell-type specific, consistent with the cell-type-specific activity of enhancers.[18][19]

Association with TADs

Topologically associating domains (TADs) are megabase-size regions of relatively high DNA interaction frequencies.[20][21] Mechanistic studies indicate TADs are single insulated neighborhoods or collections of insulated neighborhoods.[22]

Relevance to human disease

Genetic and epigenetic variation of insulated neighborhood anchors has been linked to several human diseases. One study found that a genetic variant linked to asthma disrupts CTCF binding and insulated neighborhood formation.[23] Studies of imprinted loci showed that DNA methylation controls CTCF-anchored loops that regulate gene expression. Individuals with methylation aberrations at an imprinted CTCF-binding site near IGF2/H19 form aberrant insulated neighborhoods and develop Beckwith-Wiedemann syndrome (when both alleles have the paternal type of insulated neighborhood) or Silver-Russell syndrome (when both alleles have the maternal type of insulated neighborhood).[24]

Insulated neighborhoods aid in identifying the target genes of disease-associated enhancer variants. The majority of disease-linked DNA variants identified from genome-wide association studies occur in enhancers.[25][26][27][28] Identifying target genes of enhancers with disease-linked variants has been difficult because enhancers may act over long distances, but the constraint on enhancer-gene targeting by insulated neighborhoods refines the prediction of target genes. For example, a DNA variant associated with type 2 diabetes occurs within an enhancer located between the CDC123 and CAMK1D genes but only affects CAMK1D because this gene and the enhancer are within the same insulated neighborhood, while CDC123 lies outside the neighborhood.[29][30]
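The target-gene refinement described above can be sketched in Python as follows. The sketch assumes neighborhoods and genes are given as simple (chromosome, start, end) spans; the function name `candidate_targets` and every coordinate are hypothetical placeholders chosen only to mirror the CDC123/CAMK1D arrangement, not real genomic positions.

```python
from typing import Dict, List, Tuple

Span = Tuple[str, int, int]  # (chromosome, start, end)


def candidate_targets(variant: Tuple[str, int],
                      neighborhoods: List[Span],
                      genes: Dict[str, Span]) -> List[str]:
    """Return genes that share an insulated neighborhood with the variant."""
    chrom, pos = variant
    targets = set()
    for n_chrom, n_start, n_end in neighborhoods:
        # Skip neighborhoods that do not contain the variant.
        if n_chrom != chrom or not (n_start <= pos <= n_end):
            continue
        for name, (g_chrom, g_start, g_end) in genes.items():
            # A gene is a candidate only if it lies fully inside the loop.
            if g_chrom == chrom and n_start <= g_start and g_end <= n_end:
                targets.add(name)
    return sorted(targets)


# Hypothetical layout: CDC123 | anchor | enhancer variant, CAMK1D | anchor
example_genes = {
    "CDC123": ("chr10", 100, 200),  # outside the neighborhood
    "CAMK1D": ("chr10", 400, 600),  # inside the neighborhood
}
example_neighborhoods = [("chr10", 300, 700)]
example_variant = ("chr10", 350)  # lies between the two genes

predicted = candidate_targets(example_variant, example_neighborhoods, example_genes)
```

With this layout the prediction is limited to CAMK1D, even though the variant lies between the two genes, because CDC123 sits outside the loop anchors. Such a filter narrows the candidates but does not by itself demonstrate a regulatory effect.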

Somatic mutations that alter insulated neighborhood anchors can contribute to tumorigenesis. Chromosomal alterations such as translocations, deletions and tandem duplications that intersect insulated neighborhood anchor sites can activate oncogenes.[31][32][33] Epigenetic dysregulation can also contribute to tumorigenesis by altering insulated neighborhoods. IDH-mutant gliomas display altered DNA methylation patterns, and because CTCF binding depends on DNA methylation, CTCF binding is also altered.[34] Altered CTCF binding disrupts insulated neighborhoods and can lead to oncogene misregulation.

References