Saturday, July 11, 2020

DNA Binding Domains- Introduction and its Types.

What is DNA?

DNA is a molecule composed of two polynucleotide chains that coil around each other forming a double helix structure carrying genetic instructions for the development, functioning, growth and reproduction of all known organisms and many viruses. DNA and ribonucleic acid (RNA) are nucleic acids.

Alongside proteins, lipids and complex carbohydrates (polysaccharides), nucleic acids are one of the four major types of macromolecules that are essential for all known forms of life.

What is DNA binding domain?

A DNA-binding domain (DBD) is an independently folded protein domain that contains at least one structural motif that recognizes double- or single-stranded DNA. In a chain-like biological molecule, such as a protein or nucleic acid, a structural motif is a supersecondary structure that also appears in a variety of other molecules. Motifs do not allow biological function to be predicted, because they are found in proteins and enzymes with dissimilar functions. A DBD can recognize a specific DNA sequence (a recognition sequence) or have a general affinity for DNA. Some DNA-binding domains may also include nucleic acids in their folded structure.


What is DNA binding protein?

DNA-binding proteins are proteins that have DNA-binding domains and as a result have a specific or general affinity for single- or double-stranded DNA. Sequence-specific DNA-binding proteins mainly interact with the major groove of B-DNA, since it exposes more of the functional groups that identify a base pair. However, there are some known minor groove DNA-binding ligands such as netropsin, distamycin, Hoechst 33258, pentamidine, DAPI and others.
One or more DNA-binding domains are often part of a larger protein that contains further domains with differing functions. The extra domains often regulate the activity of the DNA-binding domain. The function of DNA binding is either structural or involves transcription regulation, with the two roles sometimes overlapping.

DNA-binding domains with functions involving DNA structure have biological roles in DNA replication, repair, storage, and modification, such as methylation.

DNA methylation is a biological process by which methyl groups are added to the DNA molecule. Methylation can change the activity of a DNA segment without changing its sequence. When located in a gene promoter, DNA methylation typically acts to repress gene transcription. In mammals, DNA methylation is essential for normal development and is associated with a number of key processes including genomic imprinting, X-chromosome inactivation, repression of transposable elements, aging, and carcinogenesis.

Many proteins involved in the regulation of gene expression contain DNA-binding domains. For example, proteins that regulate transcription by binding DNA are called transcription factors. (Transcription is the first of several steps of DNA-based gene expression, in which a particular segment of DNA is copied into RNA, especially mRNA, by the enzyme RNA polymerase.) The final output of most cellular signaling cascades is gene regulation.

(Signal transduction, or a signaling cascade, is the process by which a chemical or physical signal is transmitted through a cell as a series of molecular events, most commonly protein phosphorylation catalyzed by protein kinases, which ultimately results in a cellular response. Proteins responsible for detecting stimuli are termed receptors, although in some cases the term sensor is used. Regulation of gene expression, or gene regulation, includes a wide range of mechanisms that cells use to increase or decrease the production of specific gene products such as protein or RNA.)

The DBD interacts with the nucleotides of DNA in a sequence-specific or non-sequence-specific manner, but even non-sequence-specific recognition involves some sort of molecular complementarity between protein and DNA. DNA recognition by the DBD can occur at the major or minor groove of DNA or at the sugar-phosphate backbone. Each specific type of DNA recognition is tailored to the protein's function. For example, the DNA-cutting enzyme DNase I cuts DNA almost randomly and so must bind to DNA in a non-sequence-specific manner. Even so, DNase I recognizes a certain 3-D DNA structure, yielding a somewhat specific DNA cleavage pattern that can be useful for studying DNA recognition by a technique called DNA footprinting.

Many DNA-binding domains must recognize specific DNA sequences, such as DBDs of transcription factors that activate specific genes, or those of enzymes that modify DNA at specific sites, like restriction enzymes and telomerase. In the DNA major groove, the hydrogen bonding pattern is less degenerate than that of the DNA minor groove, providing a more attractive site for sequence-specific DNA recognition.
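To make "less degenerate" concrete, here is a minimal sketch that compares the pattern of chemical groups each base pair exposes in the two grooves. The four-letter codes (A = hydrogen-bond acceptor, D = donor, H = non-polar hydrogen, M = methyl group) are the usual textbook description of B-DNA and are an assumption added here; they are not given in this post. The dictionary and function names are made up for illustration.

```python
# Textbook edge patterns for B-DNA base pairs (assumed, not from this post).
# A = H-bond acceptor, D = H-bond donor, H = non-polar hydrogen, M = methyl.
MAJOR_GROOVE = {"A-T": "ADAM", "T-A": "MADA", "G-C": "AADH", "C-G": "HDAA"}
MINOR_GROOVE = {"A-T": "AHA", "T-A": "AHA", "G-C": "ADA", "C-G": "ADA"}


def distinguishable_groups(patterns):
    """Group base pairs that expose the same pattern and so look alike."""
    groups = {}
    for base_pair, pattern in patterns.items():
        groups.setdefault(pattern, []).append(base_pair)
    return sorted(groups.values())


major_groups = distinguishable_groups(MAJOR_GROOVE)
minor_groups = distinguishable_groups(MINOR_GROOVE)

print("major groove:", major_groups)
print("minor groove:", minor_groups)
```

Under these assumed patterns, every base pair has its own signature in the major groove, while in the minor groove A-T cannot be told apart from T-A, nor G-C from C-G. That is the sense in which the major groove is the better site for reading sequence.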

The specificity of DNA-binding proteins can be studied using many biochemical and biophysical techniques, such as gel electrophoresis, analytical ultracentrifugation, calorimetry, DNA mutation, protein structure mutation or modification, nuclear magnetic resonance, x-ray crystallography, surface plasmon resonance, electron paramagnetic resonance, and cross-linking.


A large fraction of genes in each genome encodes DNA-binding proteins.

Types of DNA binding domains:

(1) Helix-turn-helix DNA binding motif:


In proteins, the helix-turn-helix (HTH) is a major structural motif capable of binding DNA. Each monomer incorporates two α helices, joined by a short strand of amino acids, that bind to the major groove of DNA. The HTH motif occurs in many proteins that regulate gene expression. It should not be confused with the helix-loop-helix motif.

The discovery of the helix-turn-helix motif was based on similarities between several genes encoding transcription regulatory proteins from bacteriophage lambda and Escherichia coli: Cro, CAP, and λ repressor, which were found to share a common sequence of 20-25 amino acids that facilitates DNA recognition.

The helix-turn-helix motif is a DNA-binding motif. Recognition of and binding to DNA by helix-turn-helix proteins is carried out by the two α helices, one occupying the N-terminal end of the motif and the other the C-terminus. In most cases, such as in the Cro repressor, the second helix contributes most to DNA recognition, and hence it is often called the "recognition helix". It binds to the major groove of DNA through a series of hydrogen bonds and various van der Waals interactions with exposed bases. The other α helix stabilizes the interaction between protein and DNA but does not play a particularly strong role in its recognition. The recognition helix and its preceding helix always have the same relative orientation.

Classification of helix-turn-helix motifs:


1. Di-helical:

The di-helical helix-turn-helix motif is the simplest helix-turn-helix motif. A fragment of the Engrailed homeodomain encompassing only the two helices and the turn was found to be an ultrafast independently folding protein domain.


2. Tri-helical:

An example of this motif is found in the Transcriptional activator Myb.


3. Tetra-helical:

The tetra-helical helix-turn-helix motif has an additional C-terminal helix compared to the tri-helical motifs. These include the LuxR-type DNA-binding HTH domain found in bacterial transcription factors and the helix-turn-helix motif found in the TetR repressors. Multihelical versions with additional helices also occur.


4. Winged helix-turn-helix:

The winged helix-turn-helix (wHTH) motif is formed by a 3-helical bundle and a 3- or 4-strand beta-sheet (wing). The topology of helices and strands in the wHTH motifs may vary. In the transcription factor ETS wHTH folds into a helix-turn-helix motif on a four-stranded anti-parallel beta-sheet scaffold arranged in the order α1-β1-β2-α2-α3-β3-β4 where the third helix is the DNA recognition helix.

5. Other modified helix-turn-helix motifs:

Other derivatives of the helix-turn-helix motif include the DNA-binding domain found in MarR, a regulator of multiple antibiotic resistance, which forms a winged helix-turn-helix with an additional C-terminal alpha helix.

(2) Zinc finger:


A zinc finger is a small protein structural motif characterized by the coordination of one or more zinc ions (Zn2+) that stabilize the fold. The term was originally coined to describe the finger-like appearance of a hypothesized structure from Xenopus laevis transcription factor IIIA (TFIIIA), but it has now come to encompass a wide variety of differing protein structures. In 1983, Xenopus laevis TFIIIA was shown to contain zinc and to require the metal for function, the first such zinc requirement reported for a gene regulatory protein. The zinc finger often appears as a metal-binding domain in multi-domain proteins.

Proteins that contain zinc fingers (zinc finger proteins) are classified into several different structural families. Unlike many other clearly defined supersecondary structures such as Greek keys or β hairpins, there are a number of types of zinc fingers, each with a unique three-dimensional architecture. A particular zinc finger protein's class is determined by this three-dimensional structure, but it can also be recognized from the primary structure of the protein or the identity of the ligands coordinating the zinc ion. (A ligand is an ion or molecule (functional group) that binds to a central metal atom to form a coordination complex. The bonding with the metal generally involves formal donation of one or more of the ligand's electron pairs. The nature of metal–ligand bonding can range from covalent to ionic.)

Since their original discovery and the elucidation of their structure, these interaction modules have proven ubiquitous in the biological world and may be found in 3% of the genes of the human genome. In addition, zinc fingers have become extremely useful in various therapeutic and research capacities.

Initially, the term zinc finger was used solely to describe the DNA-binding motif found in Xenopus laevis; however, it is now used to refer to any number of structures related by their coordination of a zinc ion. In general, zinc fingers coordinate zinc ions with a combination of cysteine and histidine residues. Originally, the number and order of these residues were used to classify different types of zinc fingers (e.g., Cys2His2, Cys4, and Cys6). More recently, a more systematic method has been used instead. This method classifies zinc finger proteins into "fold groups" based on the overall shape of the protein backbone in the folded domain. The most common fold groups of zinc fingers are the Cys2His2-like (the "classic zinc finger"), treble clef, and zinc ribbon.

The following table shows the different structures and their key features:


FOLD GROUP             LIGAND PLACEMENT

1.Cys2His2          -  Two ligands from a knuckle and two more from the C terminus of a helix.
2.Gag knuckle       -  Two ligands from a knuckle and two more from a short helix or loop.
3.Treble clef       -  Two ligands from a knuckle and two more from the N terminus of a helix.
4.Zinc ribbon       -  Two ligands each from two knuckles.
5.Zn2/Cys6          -  Two ligands from the N terminus of a helix and two more from a loop.
6.TAZ2 domain like  -  Two ligands from the termini of two helices.


1.Cys2His2:

This class of zinc fingers can have a variety of functions such as binding RNA and mediating protein-protein interactions, but is best known for its role in sequence-specific DNA-binding proteins such as Zif268.

X2-Cys-X2,4-Cys-X12-His-X3,4,5-His

In such proteins, individual zinc finger domains typically occur as tandem repeats with two, three, or more fingers comprising the DNA-binding domain of the protein. These tandem arrays can bind in the major groove of DNA and are typically spaced at 3-bp intervals. The α-helix of each domain (often called the "recognition helix") can make sequence-specific contacts to DNA bases; residues from a single recognition helix can contact four or more bases to yield an overlapping pattern of contacts with adjacent zinc fingers.
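The Cys2His2 spacing pattern given above (X2-Cys-X2,4-Cys-X12-His-X3,4,5-His) can be turned into a simple sequence scan. The following is a minimal sketch, not a validated motif finder. It assumes that "X2,4" means two to four residues of any kind and "X3,4,5" means three to five, and it uses a made-up synthetic sequence rather than a real protein. The names C2H2_PATTERN and find_c2h2_candidates are hypothetical.

```python
import re

# Assumed reading of the pattern: X = any residue, "2,4" = 2 to 4 residues,
# "3,4,5" = 3 to 5 residues. One-letter amino acid codes are expected.
C2H2_PATTERN = re.compile(r"[A-Z]{2}C[A-Z]{2,4}C[A-Z]{12}H[A-Z]{3,5}H")


def find_c2h2_candidates(sequence):
    """Return (start index, matched text) for each non-overlapping match."""
    seq = sequence.upper()
    return [(m.start(), m.group()) for m in C2H2_PATTERN.finditer(seq)]


# Synthetic test sequence built to fit the spacing; it is not a real protein.
synthetic = "GG" + "AA" + "C" + "AA" + "C" + "A" * 12 + "H" + "AAA" + "H" + "GG"

for start, match in find_c2h2_candidates(synthetic):
    print(start, match)
```

A hit from a scan like this only says that the cysteines and histidines are spaced as in the pattern. As noted earlier, a finger's class is ultimately determined by its three-dimensional structure, so matches should be read as candidates.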

2.Gag knuckle:

This fold group is defined by two short β-strands connected by a turn (zinc knuckle) followed by a short helix or loop and resembles the classical Cys2His2 motif with a large portion of the helix and β-hairpin truncated.

The retroviral nucleocapsid (NC) proteins from HIV and other related retroviruses are examples of proteins possessing these motifs. The gag-knuckle zinc finger in the HIV NC protein is the target of a class of drugs known as zinc finger inhibitors.

3.Treble clef:

The treble-clef motif consists of a β-hairpin at the N-terminus and an α-helix at the C-terminus that each contribute two ligands for zinc binding, although a loop and a second β-hairpin of varying length and conformation can be present between the N-terminal β-hairpin and the C-terminal α-helix. These fingers are present in a diverse group of proteins that frequently do not share sequence or functional similarity with each other. The best-characterized proteins containing treble-clef zinc fingers are the nuclear hormone receptors.

4.Zinc ribbon:


The zinc ribbon fold is characterised by two beta-hairpins forming two structurally similar zinc-binding sub-sites.

5.Zn2/Cys6:

The canonical members of this class contain a binuclear zinc cluster in which two zinc ions are bound by six cysteine residues. These zinc fingers can be found in several transcription factors including the yeast Gal4 protein.

(3) Leucine zippers:

Leucine zippers are a dimerization motif of the bZIP (Basic-region leucine zipper) class of eukaryotic transcription factors. The bZIP domain is 60 to 80 amino acids in length with a highly conserved DNA binding basic region and a more diversified leucine zipper dimerization region.

Leucine zipper motifs are considered a subtype of coiled coils, which are built by two or more alpha helices that are wound around each other to form a supercoil. Coiled coils contain 3- and 4-residue repeats whose hydrophobicity pattern and residue composition are compatible with the structure of amphipathic alpha-helices. The alternating three- and four-residue sequence elements constitute heptad repeats in which the amino acids are designated a to g. Residues in positions a and d are generally hydrophobic and form a zigzag pattern of knobs and holes that interlock with a similar pattern on another strand to form a tight-fitting hydrophobic core, while residues in positions e and g are charged residues contributing to the electrostatic interaction.
In the case of leucine zippers, leucines are predominant at the d position of the heptad repeat. These residues pack against each other every second turn of the alpha-helices, and the hydrophobic region between two helices is completed by residues at the a positions, which are also frequently hydrophobic. They are referred to as coiled coils unless they are proven to be important for protein function. If that is the case, then they are annotated in the “domain” subsection, which would be the bZIP domain.
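The heptad bookkeeping described above can be sketched in a few lines: label each residue a to g in turn, then check how often leucine sits at the d position. This is only an illustration. It assumes the register (which residue is position a) is already known, and the test sequence is a made-up repeat, not a real bZIP protein. The function names are hypothetical.

```python
HEPTAD_POSITIONS = "abcdefg"


def heptad_register(sequence, first_position="a"):
    """Pair each residue with its heptad position, given the first one."""
    offset = HEPTAD_POSITIONS.index(first_position)
    return [
        (residue, HEPTAD_POSITIONS[(offset + i) % 7])
        for i, residue in enumerate(sequence)
    ]


def leucine_fraction_at_d(sequence, first_position="a"):
    """Fraction of d positions occupied by leucine (L)."""
    d_residues = [
        residue
        for residue, position in heptad_register(sequence.upper(), first_position)
        if position == "d"
    ]
    if not d_residues:
        return 0.0
    return d_residues.count("L") / len(d_residues)


# Made-up heptad: V at a, L at d (hydrophobic), E at e and K at g (charged).
synthetic_zipper = "VKELEDK" * 4

print(leucine_fraction_at_d(synthetic_zipper))
```

Shifting the assumed register changes the answer, which is why the starting position is an explicit parameter: with first_position="b" the same sequence no longer has any leucine at d.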

Two different types of such α-helices can pair up to form a heterodimeric leucine zipper. With apolar amino acid residues at either the e or g position, a heterotetramer consisting of two different leucine zippers can be generated in vitro, which implies that the overall hydrophobicity of the interaction surface and van der Waals interactions may alter the organization of coiled coils and play a role in the formation of leucine zipper heterodimers.

(4) Winged helix:

Winged helix DNA-binding proteins share a related winged helix-turn-helix DNA-binding motif, where the "wings", or loops, are small beta-sheets. The winged helix motif consists of two wings (W1, W2), three alpha helices (H1, H2, H3) and three beta-sheets (S1, S2, S3) arranged in the order H1-S1-H2-H3-S2-W1-S3-W2. The DNA-recognition helix makes sequence-specific DNA contacts with the major groove of DNA, while the wings make different DNA contacts, often with the minor groove or the backbone of DNA. Several winged-helix proteins display an exposed patch of hydrophobic residues thought to mediate protein-protein interactions.
Many different proteins with diverse biological functions contain a winged helix DNA-binding domain, including transcriptional repressors such as biotin repressor, LexA repressor and the arginine repressor; transcription factors such as the hepatocyte nuclear factor-3 proteins involved in cell differentiation, heat-shock transcription factor, and the general transcription factors TFIIE and TFIIF; helicases such as RuvB, which promotes branch migration at the Holliday junction, and CDC6 in the pre-replication complex; endonucleases such as FokI and TnsA; histones; and Mu transposase, where the flexible wing of the enhancer-binding domain is essential for efficient transposition. The winged helix domain consists of about 110 amino acids. The domain in winged-helix transcription factors has four helices and a two-strand beta-sheet. These proteins are classified into 19 families called FoxA-FoxS.
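The element order quoted above for the winged helix motif (H1-S1-H2-H3-S2-W1-S3-W2) is compact enough to handle as a string. As a small sketch, the following parses such an order and counts each kind of element, which is a quick way to check a topology against its description. The label meanings follow this post (H = helix, S = beta-sheet element, W = wing); the function names are hypothetical.

```python
from collections import Counter

# Label meanings as used in this post.
ELEMENT_NAMES = {"H": "helix", "S": "sheet", "W": "wing"}


def parse_topology(topology):
    """Split 'H1-S1-...' into (element kind, index) pairs in chain order."""
    return [
        (ELEMENT_NAMES[label[0]], int(label[1:]))
        for label in topology.split("-")
    ]


def count_elements(topology):
    """Count how many helices, sheet elements and wings the order contains."""
    return Counter(kind for kind, _ in parse_topology(topology))


WINGED_HELIX = "H1-S1-H2-H3-S2-W1-S3-W2"
counts = count_elements(WINGED_HELIX)

print(parse_topology(WINGED_HELIX))
print(dict(counts))
```

For the order given above, the counts come out as three helices, three sheet elements and two wings, matching the description of the motif.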

Mutations in FoxP proteins are implicated in human autoimmune diseases.

(5) Winged helix-turn-helix:

The winged helix-turn-helix (wHTH) domain (SCOP 46785) is typically 85-90 amino acids long. It is formed by a 3-helical bundle and a 4-strand beta-sheet (wing).

(6) Helix-loop-helix:

The basic helix-loop-helix (bHLH) domain is found in some transcription factors and is characterized by two alpha helices connected by a loop. One helix is typically smaller and, due to the flexibility of the loop, allows dimerization by folding and packing against another helix. The larger helix typically contains the DNA-binding regions.

(7) HMG-box:


HMG-box domains are found in high mobility group proteins which are involved in a variety of DNA-dependent processes like replication and transcription. They also alter the flexibility of the DNA by inducing bends. The domain consists of three alpha helices separated by loops.

(8) Wor3 domain:


Wor3 domains, named after the White–Opaque Regulator 3 (Wor3) in Candida albicans, arose more recently in evolutionary time than most previously described DNA-binding domains and are restricted to a small number of fungi.

(9) OB-fold domain:
The OB-fold is a small structural motif originally named for its oligonucleotide/oligosaccharide binding properties. OB-fold domains range between 70 and 150 amino acids in length. OB-folds bind single-stranded DNA, and hence are single-stranded binding proteins.

OB-fold proteins have been identified as critical for DNA replication, DNA recombination, DNA repair, transcription, translation, cold shock response, and telomere maintenance.


