Publications &



Research Interests

Structure of the Lac Repressor-DNA Complex
Flexible docking of DNA to RNA polymerase
Control of transcription by designed DNA bending drugs
Naturally Discrete Model for DNA
Theory of Elastic Rods and Its Application to DNA

Structure of the Lac Repressor-DNA Complex

One of the best understood bacterial gene regulatory systems is the lac operon of E. coli, which has been studied by genetic, biochemical, and physical methods for over 50 years. This operon is responsible for regulating transcription of the genes lacZ, lacY, and lacA, which code for the proteins (beta-galactosidase, permease, and transacetylase) involved in the uptake and metabolism of lactose. In the presence of lactose and absence of glucose, expression of the lacZYA genes is activated by the CAP protein, which binds to the lac promoter in the vicinity of the RNA polymerase binding site. In the absence of lactose, that expression is repressed by the Lac repressor protein (LacR), a tetramer capable of binding two DNA sites simultaneously. The lac promoter region contains three binding sites for LacR: the main operator O1, located near the transcription start site, and two auxiliary operators, O2 and O3, which lie, respectively, 401 bp downstream and 92 bp upstream of O1. Although the auxiliary operators are weak, in the sense that O2 has about 10% and O3 only 0.3% of the binding strength of O1, the presence of at least one of them in the promoter region is essential to the proper function of the repressor; the absence of both results in a 70-fold decrease in the efficiency of repression.

Much is known about the genetics and biochemistry of the control of lacZYA expression and about the atomic-level structure of the macromolecular components of the lac system, i.e., the DNA, LacR, CAP, and the RNA polymerase. The principal open problem is to determine the mechanisms of repression and activation of the lacZYA genes and the role of the auxiliary operators in those mechanisms. It is generally accepted that repression results from competition between LacR and RNA polymerase for binding to the promoter. In addition, it is believed that the auxiliary operators help to increase the local concentration of LacR near the O1 operator site, and that the DNA loop formed when LacR binds to two operator sites impedes the access of the CAP protein or the RNA polymerase to the promoter. I am working on a comprehensive theory of the mechanisms of repression and activation of transcription of the lacZYA genes. The approach employs a naturally discrete, base-pair-level model of DNA elasticity to calculate the structure of the LacR-DNA complex, together with numerical solutions of the stochastic differential equations governing its dynamical behavior.

Flexible docking of DNA to RNA polymerase

In prokaryotic cells the RNA polymerase (RNAP) core enzyme is a protein complex consisting of five subunits: αI, αII, β, β′, and ω. During transcription initiation the RNAP core enzyme associates with a σ subunit to form the RNAP holoenzyme, which is capable of sequence-specific binding to DNA. The structures of the core enzyme and the holoenzyme, determined by X-ray crystallography, have provided valuable information about the role of the σ subunit in transcription initiation. However, issues such as promoter recognition and clearance, and the mechanism and kinetics of formation of the RNAP open and closed complexes, which are the steps on the transcriptional pathway that are under the influence of control proteins, cannot be resolved without the structure of the RNAP holoenzyme-DNA complex.

A major problem in modeling the RNAP-DNA complex is that DNA undergoes strong sequence-dependent deformation upon binding. Such deformation is best understood for common gene promoters that are activated by the catabolite activator protein (CAP). So-called class I promoters are characterized by binding of CAP at the -93, -83, -72, or -62 location of the upstream promoter region and by steric contact between the αCTD domain of RNAP and CAP, while class II promoters are characterized by binding of CAP near the -41 site and by steric contacts between αCTD and CAP and between αNTD and CAP. For both classes of promoters the CAP-RNAP-DNA complex has been characterized by biochemical and genetic methods. The model under development will account for the available high-resolution structures of individual components and subassemblies, long-range distance constraints obtained by systematic FRET measurements, short-distance constraints from crosslinking data, and contact constraints from mutational analysis.

The algorithm I am developing treats proteins as elastically deformable bodies and explicitly accounts for the sequence-dependent elastic properties of nucleic acids, which is expected to make it more accurate and reliable than classical protein docking approaches. The distance constraints are imposed as energetic penalties. A grid search first determines a coarse alignment; Metropolis Monte Carlo sampling of trial configurations then locates the configuration that best satisfies the distance constraints while minimizing the elastic deformation energy of the proteins and DNA.
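The two-stage search described above (coarse grid search followed by Metropolis Monte Carlo refinement against penalty energies) can be illustrated with a deliberately simplified sketch. It is not the algorithm under development: the moving body is rigid and may only translate, the elastic deformation energy is omitted, and the function names, penalty weight, step size, and temperature are all assumptions chosen for illustration.

```python
import itertools
import math
import random


def constraint_energy(shift, body_pts, anchor_pts, target_dists, weight=1.0):
    """Quadratic penalty per distance constraint for a rigid body translated by `shift`."""
    energy = 0.0
    for body, anchor, d0 in zip(body_pts, anchor_pts, target_dists):
        moved = tuple(b + s for b, s in zip(body, shift))
        energy += weight * (math.dist(moved, anchor) - d0) ** 2
    return energy


def grid_search(energy, lo, hi, n_points):
    """Coarse alignment: evaluate the energy on a regular 3-D grid, keep the best node."""
    axis = [lo + (hi - lo) * k / (n_points - 1) for k in range(n_points)]
    return min(itertools.product(axis, repeat=3), key=energy)


def metropolis(energy, start, n_steps, step_size, kT, rng):
    """Metropolis sampling of trial translations; returns the lowest-energy state visited."""
    current = tuple(start)
    e_current = energy(current)
    best, e_best = current, e_current
    for _ in range(n_steps):
        trial = tuple(c + rng.uniform(-step_size, step_size) for c in current)
        e_trial = energy(trial)
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if e_trial <= e_current or rng.random() < math.exp(-(e_trial - e_current) / kT):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best


# Hypothetical toy data: four labelled points on the body, four fixed anchors.
body = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
anchors = [(5.0, 0.0, 0.0), (0.0, 5.0, 0.0), (0.0, 0.0, 5.0), (-5.0, -5.0, -5.0)]
true_shift = (2.0, -1.0, 0.5)
targets = [math.dist(tuple(b + s for b, s in zip(p, true_shift)), a)
           for p, a in zip(body, anchors)]


def toy_energy(shift):
    return constraint_energy(shift, body, anchors, targets)


coarse = grid_search(toy_energy, -3.0, 3.0, 7)
refined, e_refined = metropolis(toy_energy, coarse, 2000, 0.2, 0.01, random.Random(0))
```

In a fuller treatment the configuration would also include rotations and internal deformation coordinates, and the energy would add the elastic terms for the proteins and the DNA to the constraint penalties.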

Control of transcription by designed DNA bending drugs

Control of gene expression is one of the key areas of interest in molecular medicine. The ability to turn genes on or off with a therapeutic agent would open up the possibility of designing treatments for such major health problems as cancer, genetic diseases, and infection by antibiotic-resistant bacteria. Knowledge of the influence of DNA mechanics on transcription control offers a way to attain this ability, provided one can construct DNA-binding drugs that alter DNA structure or mechanics. Of interest are minor groove binding drugs called lexitropsins, polyamides composed of pyrrole, imidazole, and hydroxypyrrole compounds, which can be designed to bind specific DNA sequences 6-12 bp long. Such drugs by themselves do not deform DNA upon binding. Ongoing research addresses the design of modified lexitropsins in which specific methyl groups are replaced by amino groups carrying a unit positive charge. The accumulation of such charges on one side of the DNA is expected to partially neutralize the negative charges of the DNA backbone and thereby induce global bending of the DNA.

The first stage of the research consists of a molecular dynamics study of the effects of positive charges on the atomic-level configuration of the DNA-lexitropsin complex. The study uses the OPLS/AGB/NP potential, which combines a high-quality all-atom potential, an analytical version of the generalized Born electrostatic implicit-solvent model, and a non-polar hydration free energy model. Once the extent of DNA deformation is determined, its effect on the efficiency of repression of the lac operon will be computed using the naturally discrete model for DNA and verified experimentally. The major challenge of this research will be to design a drug that maximizes the desired regulatory effect for a given architecture of the target operon.

Naturally Discrete Model for DNA

The Naturally Discrete Model for DNA is a model of DNA elasticity that accounts for the dependence of the intrinsic structure and deformability of DNA on the nucleotide sequence [10]. Because the interaction of DNA with bound proteins and drugs occurs at the nucleotide level and affects the spatial location and orientation of base pairs within a binding site, the basic structural units in this naturally discrete model are taken to be base pairs. The configuration of a segment with N+1 base pairs is specified by giving, for each of its N base-pair dimers, the following kinematical variables: the tilt $\theta_1^n$, the roll $\theta_2^n$, and the twist $\theta_3^n$, which measure the relative orientation of the n-th and (n+1)-th base pairs, and the shift $\rho_1^n$, the slide $\rho_2^n$, and the rise $\rho_3^n$, which measure their relative displacement. The elastic energy $\Psi$ of a configuration is taken to be the sum of local energies $\psi^n$, each of which is a quadratic function of the deviations $\Delta\theta_i^n$ and $\Delta\rho_i^n$ of these variables from their intrinsic values, i.e.,

$$\Psi = \sum_{n=1}^{N} \psi^n, \qquad \psi^n = \frac{1}{2}\sum_{i,j=1}^{3} F_{ij}^{XY}\,\Delta\theta_i^n\,\Delta\theta_j^n + \sum_{i,j=1}^{3} G_{ij}^{XY}\,\Delta\theta_i^n\,\Delta\rho_j^n + \frac{1}{2}\sum_{i,j=1}^{3} H_{ij}^{XY}\,\Delta\rho_i^n\,\Delta\rho_j^n .$$

Here $\Delta\theta_i^n = \theta_i^n - \bar\theta_i^{XY}$ and $\Delta\rho_i^n = \rho_i^n - \bar\rho_i^{XY}$; XY is the nucleotide sequence (in the direction of the sequence strand) of the n-th base-pair dimer; $F_{ij}^{XY}$, $G_{ij}^{XY}$, and $H_{ij}^{XY}$ are the elastic moduli; and $\bar\theta_i^{XY}$ and $\bar\rho_i^{XY}$ are the intrinsic values of the kinematical variables. Empirical estimates of the intrinsic parameters and the moduli have been obtained from fluctuations and average values of structural parameters in crystals of pure DNA and DNA-protein complexes.
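The local energy described above, a quadratic function of the deviations of the six base-pair step variables from sequence-dependent intrinsic values, can be sketched in a few lines. The names `F`, `G`, `H` for the angular, coupling, and translational moduli, the dictionary layout, and every numerical value below are assumptions made for illustration; they are placeholders, not the empirical parameters mentioned in the text.

```python
def step_energy(theta, rho, params):
    """Quadratic energy of one base-pair step.

    theta = (tilt, roll, twist), rho = (shift, slide, rise);
    params holds intrinsic values and 3x3 moduli for the dimer XY.
    """
    d_theta = [t - t0 for t, t0 in zip(theta, params["theta0"])]
    d_rho = [r - r0 for r, r0 in zip(rho, params["rho0"])]
    F, G, H = params["F"], params["G"], params["H"]
    energy = 0.0
    for i in range(3):
        for j in range(3):
            energy += 0.5 * F[i][j] * d_theta[i] * d_theta[j]  # angular terms
            energy += G[i][j] * d_theta[i] * d_rho[j]          # coupling terms
            energy += 0.5 * H[i][j] * d_rho[i] * d_rho[j]      # translational terms
    return energy


def total_energy(sequence, steps, table):
    """Sum of step energies; step n uses the parameters of dimer sequence[n:n+2]."""
    if len(steps) != len(sequence) - 1:
        raise ValueError("need exactly one (theta, rho) pair per base-pair step")
    return sum(step_energy(theta, rho, table[sequence[n:n + 2]])
               for n, (theta, rho) in enumerate(steps))


def identity(scale):
    return [[scale if i == j else 0.0 for j in range(3)] for i in range(3)]


# PLACEHOLDER parameters (not empirical): the same values for all 16 dimers.
placeholder = {
    "theta0": (0.0, 0.0, 34.0),  # degrees
    "rho0": (0.0, 0.0, 3.4),     # angstroms
    "F": identity(2.0),
    "G": identity(0.0),
    "H": identity(4.0),
}
table = {x + y: placeholder for x in "ACGT" for y in "ACGT"}

relaxed = [((0.0, 0.0, 34.0), (0.0, 0.0, 3.4))] * 3
print(total_energy("ACGT", relaxed, table))  # zero at the intrinsic configuration
```

A sequence-dependent calculation would replace the single placeholder entry with a distinct parameter set for each dimer XY.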

I have developed a procedure for calculating equilibrium configurations of DNA segments with specified linking number and end conditions by recursive solution of the variational equations expressing the balance of forces and moments acting on the n-th base pair:

$$f^n - f^{n-1} + \varphi^n = 0, \qquad m^n - m^{n-1} - f^n \times r^n + \mu^n = 0 .$$

Here $f^n$ and $m^n$ are the force and moment exerted on the n-th base pair by the (n+1)-th base pair, with $m^n$ taken about the location $x^{n+1}$ of the (n+1)-th base pair and $r^n = x^{n+1} - x^n$; $\varphi^n$ and $\mu^n$ are the external force and moment acting on the n-th base pair, which may be, for example, of electrostatic origin. The solution method solves the variational equations recursively, one base pair after another, and permits one to calculate not only globally stable but also metastable and unstable configurations. It is also faster and has a broader range of applicability than methods based on energy minimization.

I have employed the method to study equilibrium configurations of a DNA o-ring [10], which is a miniplasmid that has a stress-free configuration in which the axial curve is a circle and hence mimics the highly curved structure of the kinetoplast DNA from a trypanosome. It was found that upon binding an agent that reduces the intrinsic twist at the binding site, the o-ring folds up into a configuration that resembles a clamshell. For extreme cases of twist reduction the agent causes untwisting of DNA that localizes at a site that is antipodal to the binding site of the agent, thereby producing an action at a distance of the type suggested by spectroscopic studies of the sequence-dependent structure of DNA block copolymers. (The idealized rod model of a plasmid, in which DNA is assumed to be intrinsically straight and uniformly deformable, does not predict such localized effects.)

Theory of Elastic Rods and Its Application to DNA

I began to work in this area using the framework of the idealized rod model for DNA, in which a DNA segment is modeled by an intrinsically straight, homogeneous, inextensible rod with circular cross-sections and with elastic properties characterized by two constants: a bending modulus A and a twisting modulus C. I participated in a joint project with B. D. Coleman and I. Tobias, in which we employed available exact solutions of the equations of equilibrium for the idealized rod model to develop methods of finding the anchoring conditions to be imposed by a protein at the ends of a DNA segment in order to bring sequentially distant nucleotides into proximity [2].
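For orientation, the elastic energy usually associated with an idealized rod of this kind can be written as follows. This is the standard textbook form for an inextensible rod with circular cross-section, given here as a sketch; the symbols $\kappa$ (curvature of the axial curve), $\Delta\Omega$ (excess twist density, in radians per unit length), $L$ (rod length), and $\Delta Tw$ (excess twist, in turns) are notation assumed for this illustration and may differ from the notation of the cited papers.

```latex
\Psi \;=\; \frac{A}{2}\int_0^L \kappa(s)^2\,ds \;+\; \frac{C}{2}\int_0^L \Delta\Omega(s)^2\,ds ,
\qquad
\Delta Tw \;=\; \frac{1}{2\pi}\int_0^L \Delta\Omega(s)\,ds .

% If the excess twist density is uniform along the rod,
% \Delta\Omega = 2\pi\,\Delta Tw / L, and the twisting term reduces to
\frac{C}{2}\,L\left(\frac{2\pi\,\Delta Tw}{L}\right)^{2} \;=\; \frac{2\pi^{2} C\,(\Delta Tw)^{2}}{L} .
```

The two constants A and C therefore fully determine the energy once the shape of the axial curve and the excess twist are known.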

In subsequent work I used the idealized rod model to calculate the influence of DNA binding proteins on the free energy of supercoiling, or, equivalently, on the equilibrium distribution of topoisomers obtained by relaxation with topoisomerase I. A joint paper with Coleman and Tobias [3] discusses the problem for cases in which a small DNA plasmid is bound to a single histone octamer to form a mononucleosome, and gives a numerical procedure for relating the free energy of supercoiling of a mononucleosome to the extent of wrapping of DNA about the core particle. It is also shown that the procedure permits one to estimate the DNA-histone binding energy from measured equilibrium distributions of the linking number of miniplasmids in mononucleosomes.
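The equivalence between a free energy of supercoiling and an equilibrium topoisomer distribution can be illustrated with a minimal sketch. It assumes, purely for illustration, a free energy that is quadratic in the deviation of the linking number from its relaxed value, so that each topoisomer has a Boltzmann weight exp(-K (Lk - Lk0)^2 / kT). The function name, the parameter `k_over_kt`, and the numerical values are hypothetical, and this is not the numerical procedure of [3].

```python
import math


def topoisomer_distribution(lk0, k_over_kt, half_width=10):
    """Boltzmann probabilities of integer linking numbers near lk0.

    Assumes a quadratic free energy G(Lk) = K * (Lk - lk0)**2, with
    k_over_kt = K / kT; lk0 need not be an integer.
    """
    center = round(lk0)
    lks = list(range(center - half_width, center + half_width + 1))
    weights = [math.exp(-k_over_kt * (lk - lk0) ** 2) for lk in lks]
    total = sum(weights)
    return {lk: w / total for lk, w in zip(lks, weights)}


# Hypothetical values: a miniplasmid whose relaxed linking number is 33.3.
free = topoisomer_distribution(lk0=33.3, k_over_kt=2.0)
# Crude stand-in for a bound protein that changes the relaxed linking number.
bound = topoisomer_distribution(lk0=32.2, k_over_kt=2.0)

most_likely_free = max(free, key=free.get)
most_likely_bound = max(bound, key=bound.get)
```

Read in the other direction, a measured distribution of this form gives the curvature K from the ratios of neighbouring topoisomer populations, which is the sense in which measured distributions carry information about the free energy.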

In my doctoral dissertation [4] I addressed two problems in the general theory of elastic rods that had long remained open: (I) the problem of calculating equilibrium configurations of rods with impenetrability and self-contact taken into account and (II) the problem of finding useful criteria for determining whether a calculated equilibrium configuration is stable in the sense that it gives a local minimum to the elastic energy. The availability of explicit formulae for equilibrium configurations played a key role in circumventing the difficulties encountered in earlier attempts to solve problem (I) using numerical methods that require discretization of the rod. Problem (I) was solved by using explicit solutions of the differential equations that hold in the segments separating points of contact and choosing the integration constants in those solutions so that the end conditions and the constraints implied by the impenetrability of a DNA molecule hold. Problem (II) was solved by finding conditions for the stability of an equilibrium configuration that are expressible in terms of the dependence of the excess link on writhe (a quantity describing the overall folding of a configuration) as one follows a one-parameter family of equilibrium configurations.

The solution of problems (I) and (II) enables one to calculate globally stable, metastable, and unstable configurations of a DNA miniplasmid with specified excess link and hence to calculate the energy barriers for transitions between stable and metastable equilibria [5] [6] [7]. It also allows one to develop methods for calculating equilibrium configurations of knotted DNA plasmids and bifurcation diagrams for linear DNA segments subject to applied tension and twisting moments, as in single-molecule manipulation experiments [8].