| Structure of the Lac Repressor-DNA Complex |
| Flexible docking of DNA to RNA polymerase |
| Control of transcription by designed DNA bending drugs |
| Naturally Discrete Model for DNA |
| Theory of Elastic Rods and Its Application to DNA |
One of the best understood bacterial gene regulatory systems is the lac operon of E. coli, which has been studied by genetic, biochemical, and physical methods for over 50 years. This operon is responsible for regulating the transcription of the genes lacZ, lacY, and lacA, which code for the proteins (beta-galactosidase, permease, and transacetylase) involved in the uptake and metabolism of lactose. In the presence of lactose and the absence of glucose, expression of the lacZYA genes is activated by the CAP protein, which binds to the lac promoter in the vicinity of the RNA polymerase binding site. In the absence of lactose, that expression is repressed by the Lac repressor protein (LacR), a tetramer capable of binding simultaneously to two DNA sites. The lac promoter region contains three binding sites for LacR: the main operator O1, located near the start of transcription, and two auxiliary operators, O2 and O3, which lie, respectively, 401 bp downstream and 92 bp upstream of O1. Although the auxiliary operators are weak, in the sense that O2 has about 10% and O3 only 0.3% of the binding strength of O1, the presence of at least one of them in the promoter region is essential to the proper function of the repressor; the absence of both results in a 70-fold decrease in the efficiency of repression.
Much is known about the genetics and biochemistry of the control of lacZYA expression and about the atomic-level structure of the macromolecular components of the lac system, i.e., the DNA, LacR, CAP, and the RNA polymerase. The principal open problem is that of determining the mechanisms of repression and activation of the lacZYA genes and the role of the auxiliary operators in those mechanisms. It is generally accepted that repression is the result of a competition between LacR and RNA polymerase for binding to the promoter. In addition, it is believed that the auxiliary operators help to increase the local concentration of the LacR protein near the O1 operator site, and that the DNA loop formed upon binding of LacR to two operator sites impedes the access of the CAP protein or the RNA polymerase to the promoter. I am working on the development of a comprehensive theory of the mechanisms of repression and activation of transcription of the lacZYA genes, employing a naturally discrete base-pair level model of DNA elasticity to calculate the structure of the LacR-DNA complex and numerical solutions of the stochastic differential equations governing its dynamical behavior. Recent progress is given here.
In prokaryotic cells the RNA polymerase (RNAP) core enzyme is a protein complex consisting of five subunits: αI, αII, β, β′, and ω. During transcription initiation the RNAP core enzyme associates with a σ subunit to form the RNAP holoenzyme, which is capable of sequence-specific binding to DNA. The structures of the core enzyme and the holoenzyme, determined by x-ray crystallography, have provided valuable information about the role of the σ subunit in transcription initiation. However, issues such as promoter recognition and clearance, and the mechanism and kinetics of formation of the RNAP closed and open complexes, which are the steps on the transcriptional pathway that are under the influence of control proteins, cannot be resolved without obtaining the structure of the RNAP holoenzyme-DNA complex.
A major problem in modeling the RNAP-DNA complex is that, upon binding, DNA undergoes strong sequence-dependent deformation. Such deformation can be best understood in the case of common gene promoters that are activated by the catabolite activator protein (CAP): so-called class I promoters are characterized by binding of CAP at the -93, -83, -72, or -62 location of the upstream promoter region and steric contact between the αCTD domain of RNAP and CAP, while class II promoters are characterized by binding of CAP near the -41 site and steric contacts between αCTD and CAP and between αNTD and CAP. For both classes of promoters the CAP-RNAP-DNA complex has been characterized by biochemical and genetic methods. The model under development will account for the available high-resolution structures of individual components and subassemblies, long-range distance constraints obtained by systematic FRET measurements, short-distance constraints from crosslinking data, and contact constraints from mutational analysis.
The algorithm I am developing treats proteins as elastically deformable bodies and explicitly accounts for sequence-dependent elastic properties of nucleic acids, which is expected to improve its accuracy and reliability over classical protein docking approaches. The distance constraints are imposed as energetic penalties and the optimal configurations are located by performing a grid search to determine coarse alignment, followed by Metropolis Monte Carlo sampling of trial configurations to find the optimal configuration that best satisfies distance constraints and minimizes the elastic deformational energy of the proteins and DNA.
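The sampling stage described above can be illustrated with a deliberately small sketch. It is not the docking algorithm itself: the elastic deformation energy of the proteins and DNA is omitted, the coarse grid search is skipped, and a handful of points stand in for the molecular components. The function names, the harmonic form of the penalty, and all numerical values are assumptions made for illustration only.

```python
import math
import random

def restraint_penalty(points, restraints, weight=1.0):
    """Sum of harmonic penalties weight*(d - d0)^2 over (i, j, d0) restraints."""
    total = 0.0
    for i, j, d0 in restraints:
        d = math.dist(points[i], points[j])
        total += weight * (d - d0) ** 2
    return total

def metropolis(points, restraints, steps=20000, step_size=0.5, kT=1.0, seed=0):
    """Metropolis Monte Carlo over point positions; returns the best state seen."""
    rng = random.Random(seed)
    pts = [list(p) for p in points]
    energy = restraint_penalty(pts, restraints)
    best_energy, best = energy, [p[:] for p in pts]
    for _ in range(steps):
        k = rng.randrange(len(pts))          # pick one point to perturb
        old = pts[k][:]
        pts[k] = [c + rng.uniform(-step_size, step_size) for c in old]
        trial = restraint_penalty(pts, restraints)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if trial <= energy or rng.random() < math.exp(-(trial - energy) / kT):
            energy = trial
            if energy < best_energy:
                best_energy, best = energy, [p[:] for p in pts]
        else:
            pts[k] = old                     # reject: restore previous position
    return best, best_energy

if __name__ == "__main__":
    start = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    # Hypothetical target distances (arbitrary units) between pairs of points.
    targets = [(0, 1, 3.0), (1, 2, 5.0), (0, 2, 4.0)]
    best, e = metropolis(start, targets, steps=5000, kT=0.1)
    print("initial penalty:", restraint_penalty(start, targets))
    print("best penalty:   ", e)
```

In a fuller treatment the penalty would be added to an elastic energy term, so that a trial configuration is judged by both how well it meets the distance data and how much it deforms the components.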
Control of gene expression is one of the key areas of interest in molecular medicine. The ability to turn genes on or off by the action of a therapeutic agent would open up the possibility of designing treatments for such major health problems as cancer, genetic diseases, or infection by antibiotic-resistant bacteria. Knowledge of the influence of DNA mechanics on transcription control presents a way of attaining this ability if one is able to construct DNA binding drugs that alter DNA structure or mechanics. Of interest are minor groove binding drugs called lexitropsins, which are polyamides composed of pyrrole, imidazole, and hydroxypyrrole units that can be designed to bind specific DNA sequences 6-12 bp long. Such drugs by themselves do not deform DNA upon binding. Ongoing research is directed at the design of modified lexitropsins in which specific methyl groups are replaced by amino groups carrying a unit positive charge. The accumulation of such charges on one side of DNA is expected to lead to a partial neutralization of the negative charges of the DNA backbone and to induce global bending of DNA.
The first stage of the research consists of a molecular dynamics study of
the effects of positive charges on the atomic-level configuration of
the DNA-lexitropsin complex. The study uses the OPLS/AGB/NP potential,
which combines a high-quality all-atom potential, an analytical version
of the generalized Born electrostatic implicit-solvent model, and a
non-polar-hydration-free-energy model. Once the extent of DNA
deformation is determined, its effect on the efficiency of repression
of the lac operon will be calculated using the naturally discrete
model for DNA and verified experimentally. The major challenge of this
research will be to design a drug that maximizes the desired
regulatory effect given a specific architecture of the target operon.
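The Naturally Discrete Model for DNA is a model for DNA elasticity that
accounts for the dependence of the intrinsic structure and
deformability of DNA on the nucleotide sequence [10]. Because the
interaction of DNA with bound proteins and drugs occurs on a nucleotide
level and affects the spatial location and orientation of base pairs
within a binding site, the basic structural units in this naturally
discrete model are taken to be base pairs. The configuration of a
segment with N+1 base pairs is specified by giving, for each of its N
base-pair dimers, the following kinematical variables: the tilt
$\theta_1^n$, the roll $\theta_2^n$, and the twist $\theta_3^n$, which
measure the relative orientation of the n-th and (n+1)-th base pairs,
and the shift $\rho_1^n$, the slide $\rho_2^n$, and the rise
$\rho_3^n$, which are measures of their relative displacement. The
elastic energy $\Psi$ of a configuration is taken to be the sum of
local energies $\psi^n$, each of which is a quadratic function of
$\Delta\theta_i^n$ and $\Delta\rho_i^n$, i.e.,

$$\Psi = \sum_{n=1}^{N} \psi^n, \qquad \psi^n = \frac{1}{2}\sum_{i,j=1}^{3} F_{ij}^{XY}\,\Delta\theta_i^n\,\Delta\theta_j^n + \sum_{i,j=1}^{3} G_{ij}^{XY}\,\Delta\theta_i^n\,\Delta\rho_j^n + \frac{1}{2}\sum_{i,j=1}^{3} H_{ij}^{XY}\,\Delta\rho_i^n\,\Delta\rho_j^n;$$

here $\Delta\theta_i^n = \theta_i^n - \bar{\theta}_i^{XY}$,
$\Delta\rho_i^n = \rho_i^n - \bar{\rho}_i^{XY}$, XY is the nucleotide
sequence (in the direction of the sequence strand) of the n-th
base-pair dimer; $F_{ij}^{XY}$, $G_{ij}^{XY}$, and $H_{ij}^{XY}$ are the
elastic moduli, and $\bar{\theta}_i^{XY}$ and $\bar{\rho}_i^{XY}$
are the intrinsic values of the kinematical variables.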
Empirical estimates of the intrinsic parameters and the moduli have
been obtained from fluctuations and average values of structural
parameters in crystals of pure DNA and DNA-protein complexes.
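As a concrete illustration of a sequence-dependent quadratic energy of this kind, the sketch below evaluates the energy of a short segment as a sum of per-dimer terms. The function names, the 3x3 modulus matrices F, G, and H, and every numerical value are hypothetical placeholders chosen for readability; they are not the empirical parameters referred to in the text.

```python
def quad(a, M, b):
    """Bilinear form sum_ij a[i] * M[i][j] * b[j] for 3-vectors and a 3x3 matrix."""
    return sum(a[i] * M[i][j] * b[j] for i in range(3) for j in range(3))

def step_energy(theta, rho, theta0, rho0, F, G, H):
    """Quadratic energy of one base-pair dimer.

    theta = (tilt, roll, twist), rho = (shift, slide, rise);
    theta0, rho0 are the intrinsic values for the dimer's sequence.
    """
    dth = [theta[i] - theta0[i] for i in range(3)]
    drh = [rho[i] - rho0[i] for i in range(3)]
    return 0.5 * quad(dth, F, dth) + quad(dth, G, drh) + 0.5 * quad(drh, H, drh)

def total_energy(sequence, thetas, rhos, params):
    """Sum of dimer energies; params maps a dimer string 'XY' to (theta0, rho0, F, G, H)."""
    energy = 0.0
    for n in range(len(sequence) - 1):
        theta0, rho0, F, G, H = params[sequence[n:n + 2]]
        energy += step_energy(thetas[n], rhos[n], theta0, rho0, F, G, H)
    return energy

if __name__ == "__main__":
    # Placeholder parameters (angles in degrees, lengths in angstroms).
    zero = [[0.0] * 3 for _ in range(3)]
    F = [[0.04, 0, 0], [0, 0.02, 0], [0, 0, 0.06]]
    H = [[1.0, 0, 0], [0, 2.0, 0], [0, 0, 8.0]]
    p = ((0.0, 0.0, 34.0), (0.0, 0.0, 3.4), F, zero, H)
    params = {"AT": p, "TA": p}
    thetas = [(0.0, 5.0, 34.0), (0.0, 0.0, 34.0), (0.0, 0.0, 34.0)]
    rhos = [(0.0, 0.0, 3.4)] * 3
    print(total_energy("ATAT", thetas, rhos, params))
```

Looking up the parameters by the dimer string is what makes the energy sequence dependent: two dimers with the same deformation can contribute different energies if their intrinsic values or moduli differ.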
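I have developed a procedure for calculating equilibrium
configurations of DNA segments with specified linking number and end
conditions by recursive solution of the variational equations
expressing the laws of balance of forces and moments acting on the n-th
base pair:

$$\mathbf{f}^n - \mathbf{f}^{n-1} + \boldsymbol{\varphi}^n = \mathbf{0}, \qquad \mathbf{m}^n - \mathbf{m}^{n-1} - \mathbf{f}^n \times \mathbf{r}^n + \boldsymbol{\mu}^n = \mathbf{0};$$

here $\mathbf{f}^n$ and $\mathbf{m}^n$ are the force and moment exerted
on the n-th base pair by the (n+1)-th base pair (with $\mathbf{m}^n$
taken about the center $\mathbf{x}^{n+1}$ of the (n+1)-th base pair),
$\mathbf{r}^n = \mathbf{x}^{n+1} - \mathbf{x}^n$ is the vector joining
the centers of the two base pairs, and $\boldsymbol{\varphi}^n$ and
$\boldsymbol{\mu}^n$ are the external force and moment acting on
the n-th base pair, which may be, for example, of electrostatic
origin. Solving the variational equations recursively, one base pair
after another, permits one to calculate not only
globally stable, but also metastable and unstable configurations. The
method is also faster and has a broader range of applicability than
methods based on energy minimization.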
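One ingredient of such a recursion, the propagation of forces and moments from one base pair to the next, can be sketched as follows. This is only the load-propagation step under an assumed convention (the moment on a base pair is referred to the center of its successor, and the base-pair centers are taken as given); the full procedure must also determine the configuration from the constitutive relations, which is not attempted here. All names are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def propagate_loads(f0, m0, centers, ext_forces=None, ext_moments=None):
    """Propagate internal force and moment along a chain of base pairs.

    centers[k] is the center of base pair k (0-based); step k joins base
    pairs k and k+1. Balance at base pair k is assumed to read
        f[k] = f[k-1] - phi[k]
        m[k] = m[k-1] + f[k] x r[k] - mu[k],   r[k] = centers[k+1] - centers[k],
    where phi[k], mu[k] are the external force and moment on base pair k.
    """
    n_steps = len(centers) - 1
    zero = (0.0, 0.0, 0.0)
    phi = ext_forces or [zero] * len(centers)
    mu = ext_moments or [zero] * len(centers)
    forces, moments = [tuple(f0)], [tuple(m0)]
    for k in range(1, n_steps):
        r = tuple(centers[k + 1][i] - centers[k][i] for i in range(3))
        f = tuple(forces[-1][i] - phi[k][i] for i in range(3))
        fxr = cross(f, r)
        m = tuple(moments[-1][i] + fxr[i] - mu[k][i] for i in range(3))
        forces.append(f)
        moments.append(m)
    return forces, moments

if __name__ == "__main__":
    # Straight stack along z with 3.4-angstrom rise and a transverse end force.
    centers = [(0.0, 0.0, 3.4 * k) for k in range(4)]
    forces, moments = propagate_loads((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), centers)
    for f, m in zip(forces, moments):
        print(f, m)
```

With no external loads the force is the same at every step, while the moment changes from step to step by the cross product of the force with the step vector.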
I have employed the method to study equilibrium configurations of a DNA o-ring [10], which is a miniplasmid that has a stress-free configuration in which the axial curve is a circle and hence mimics the highly curved structure of the kinetoplast DNA from a trypanosome. It was found that upon binding an agent that reduces the intrinsic twist at the binding site, the o-ring folds up into a configuration that resembles a clamshell. For extreme cases of twist reduction the agent causes untwisting of DNA that localizes at a site that is antipodal to the binding site of the agent, thereby producing an action at a distance of the type suggested by spectroscopic studies of the sequence-dependent structure of DNA block copolymers. (The idealized rod model of a plasmid, in which DNA is assumed to be intrinsically straight and uniformly deformable, does not predict such localized effects.)
I began to work in this area using the framework of the idealized rod model for DNA, in which a DNA segment is modeled by an intrinsically straight, homogeneous, inextensible rod with circular cross-sections and with elastic properties characterized by two constants: a bending modulus A and a twisting modulus C. I participated in a joint project with B. D. Coleman and I. Tobias, in which we employed available exact solutions of the equations of equilibrium for the idealized rod model to develop methods of finding the anchoring conditions to be imposed by a protein at the ends of a DNA segment in order to bring sequentially distant nucleotides into proximity [2].
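For orientation, the two-constant model can be evaluated in closed form for the simplest closed configuration, a planar circle with uniformly distributed excess twist. The sketch below computes the bending and twisting energies of such a circle and applies the classical criterion that the circle loses stability once the magnitude of the excess link exceeds sqrt(3)·A/C. The function names and the numerical values of A, C, and the contour length are illustrative assumptions, not values taken from the work described here.

```python
import math

def circle_energy(A, C, length, delta_lk):
    """Elastic energy of a planar circular rod with uniform excess twist.

    A, C: bending and twisting moduli (energy x length);
    length: contour length; delta_lk: excess link (all of it is twist,
    because the writhe of a planar circle is zero).
    Returns (bending energy, twisting energy).
    """
    curvature = 2.0 * math.pi / length
    twist_density = 2.0 * math.pi * delta_lk / length
    bend = 0.5 * A * curvature ** 2 * length
    twist = 0.5 * C * twist_density ** 2 * length
    return bend, twist

def circle_is_stable(A, C, delta_lk):
    """Classical criterion: the circle is stable while |delta_lk| < sqrt(3) * A / C."""
    return abs(delta_lk) < math.sqrt(3.0) * A / C

if __name__ == "__main__":
    A, C, L = 50.0, 75.0, 100.0   # illustrative values, in kT*nm and nm
    for dlk in (0, 1, 2):
        bend, twist = circle_energy(A, C, L, dlk)
        print(dlk, round(bend, 3), round(twist, 3), circle_is_stable(A, C, dlk))
```

Beyond the stability threshold the rod lowers its energy by leaving the plane and trading twist for writhe, which is where the treatment of self-contact becomes necessary.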
In a subsequent work I used the idealized rod model to calculate the influence of DNA binding proteins on the free energy of supercoiling, or, equivalently, on the equilibrium distribution of topoisomers obtained by relaxation with topoisomerase I. In a joint paper with Coleman and Tobias [3] the problem is discussed for cases in which a small DNA plasmid is bound to a single histone octamer to form a mononucleosome and a numerical procedure is given for relating the free energy of supercoiling of a mononucleosome to the extent of wrapping of DNA about the core particle. Also it is shown that the procedure permits one to estimate the DNA-histone binding energy from measured equilibrium distributions of the linking number of miniplasmids in mononucleosomes.
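The equivalence between a free energy of supercoiling and an equilibrium distribution of topoisomers is the Boltzmann relation: the population of each topoisomer is proportional to exp(-G/kT). The sketch below makes this explicit for an arbitrary free-energy function. The quadratic form used in the usage example, its stiffness constant, and the offset are assumptions chosen only to produce a plausible-looking distribution; they are not results from the papers cited.

```python
import math

def topoisomer_distribution(free_energy, delta_lk_values, kT=1.0):
    """Boltzmann populations of topoisomers.

    free_energy: callable giving G(delta_lk) in the same units as kT;
    delta_lk_values: the excess-link values of the topoisomers considered.
    Returns a list of probabilities that sum to 1.
    """
    energies = [free_energy(d) for d in delta_lk_values]
    g_min = min(energies)                      # subtract the minimum for numerical safety
    weights = [math.exp(-(g - g_min) / kT) for g in energies]
    z = sum(weights)
    return [w / z for w in weights]

if __name__ == "__main__":
    K = 3.0        # hypothetical stiffness, in units of kT
    offset = 0.3   # hypothetical non-integer relaxed linking number offset
    values = [d - offset for d in range(-3, 4)]
    probs = topoisomer_distribution(lambda d: K * d * d, values)
    for d, p in zip(values, probs):
        print(f"{d:+.1f}  {p:.4f}")
```

Read in the other direction, the same relation lets a measured distribution be converted to free-energy differences, since G(a) - G(b) = -kT ln(P(a)/P(b)).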
In my doctoral dissertation [4] I addressed two problems in the general theory of elastic rods that had long remained open: (I) the problem of calculating equilibrium configurations of rods with impenetrability and self-contact taken into account and (II) the problem of finding useful criteria that would allow one to determine whether a calculated equilibrium configuration is stable in the sense that it gives a local minimum to the elastic energy. The availability of explicit formulae for equilibrium configurations played a key role in circumventing the difficulties encountered in earlier attempts to solve problem (I) using numerical methods that require discretization of the rod. Problem (I) was solved by using explicit solutions of the differential equations that hold in the segments that separate points of contact and choosing the integration constants in those solutions so that the end conditions and the constraints implied by the impenetrability of a DNA molecule hold. Problem (II) was solved by finding conditions for the stability of an equilibrium configuration that are expressible in terms of the dependence of the excess link on writhe (a quantity describing the overall folding of a configuration) as one follows a family of equilibrium configurations with varying excess link.
The solution of problems (I) and (II) enables one to calculate globally stable, metastable, and unstable configurations of a DNA miniplasmid with specified excess link and hence to calculate the energy barriers for transitions between stable and metastable equilibria [5] [6] [7]. It also allows one to develop methods for calculating equilibrium configurations of knotted DNA plasmids and bifurcation diagrams for linear DNA segments subject to applied tension and twisting moments, as in single-molecule manipulation experiments [8].