Math 3380: Mathematics in Molecular Biology

Fall 2005

Instructor: David Swigon

Office: Thackeray 519, 412-624-4689, swigon@pitt.edu

Lectures:  TTh 2:30 – 3:45pm (NEW), Thack 524

Office Hours: By appointment.

Course Web Page: (check frequently for changes and updates)

http://www.math.pitt.edu/~swigon/math3380.html

Course Description

The course presents an overview of mathematical models and techniques used in molecular biology, with a focus on DNA and protein sequence analysis, atomic-level molecular modeling, and genetic network modeling. It is recommended for graduate students in mathematical or computational biology, and also for students in discrete and computational mathematics who want to learn about biological applications. No background in molecular biology or chemistry is necessary.

Prerequisites

Undergraduate-level probability theory and differential equations.

Textbook

There is no required textbook for the course.  We will cover selected material from the following books:

[DEKM] Durbin, Eddy, Krogh, & Mitchison, Biological sequence analysis, Cambridge University Press 1999, ISBN: 0521629713

[EG] Ewens & Grant, Statistical Methods in Bioinformatics, Springer 2001, ISBN: 0387952292

[S] Schlick, Molecular modeling and simulation, Springer 2002, ISBN: 038795404X

Other relevant reading material and journal articles will be distributed during the course.

Grading Scheme

Homework Assignments: 1/3

Project & presentations: 1/3

Final Exam: 1/3

Schedule

The course will be divided into three segments: Biological sequence analysis, Molecular modeling, and Genetic network analysis. We will meet two times a week for 1.5 hours in the classroom and occasionally in a computer lab.

For the term project, each student will be assigned a particular gene and will perform several analyses on it corresponding to the topics covered in the course, such as sequence alignment with related genes from related organisms, evolutionary comparison, structural analysis of the corresponding protein, energy minimization of the atomic-level structure, molecular dynamics, and genetic network analysis.

Student presentations on databases

Biological databases are essential sources of information for everyone doing modeling in molecular biology. To give an overview of the various databases available, every week or so a pair of students will give a ~10 min presentation on one of the following: NCBI, GenBank, SwissProt, KEGG, PDB & NDB, DNA microarrays, etc.

Note

This is a special topics course in applied mathematics, intended to give you an introduction to the mathematics and algorithms used in molecular biology. Emphasis will be placed on a hands-on approach and problem solving. No formal theorems and proofs will be given.

Syllabus

| Date | Covers | Topic | Homework | Notes | Further material |
|---|---|---|---|---|---|
| Aug 30 | | Introduction | | Introduction | |
| Sep 1 | 5.2-5.9 of [EG] | Single sequence analysis | | Lecture1 | [1] |
| Sep 6 | 2.1-2.9 of [DEKM] or 6.1-6.5 of [EG] | Sequence alignment | | Lecture2 | [2], Biosequence analysis |
| Sep 8 | 9.1-9.5 of [EG] | Sequence alignment, BLAST | HW#1 | Lecture3 | [3], [4], BLAST |
| Sep 13 | 3.1-3.3 of [DEKM] or 11.1-11.2 of [EG] | Hidden Markov Models | | Lecture4 | CpG island finder |
| Sep 15 | 11.3 of [EG] | Gene finding, Alignment using HMM | | | [5], GENSCAN |
| Sep 20 | 6.1-6.5 of [DEKM] | Multiple sequence alignments, CLUSTALW | | Lecture5 | |
| Sep 22 | 7.1-7.4 of [DEKM], 14.1-14.6 of [EG] | Phylogenetics – building of trees | | Lecture6 | Construction of phylogenetic trees |
| Sep 27 | 8.1-8.6 of [DEKM] | Probabilistic evolutionary models | | Lecture7 | PHYLIP |
| Sep 29 | | Review of sequence analysis | HW#2 | Bring Lecture notes 5, 7 | |
| Oct 4 | 3.1-3.4 of [S] | Biological macromolecules, proteins and DNA | | Lecture8 | |
| Oct 6 | 5.1-5.3 of [S] | Nucleic acid structure | | Lecture9 | |
| Oct 11 | 4.1-4.10 of [S] | Protein structure, PDB | | Lecture9 addendum | |
| Oct 13 | 7.3-8.8 of [S] | Molecular forces | | Lecture10 (NEW) | |
| Oct 18 | 9.1-9.5 of [S] | Algorithms in molecular mechanics | | Bring Lecture10 | [6], [7] |
| Oct 20 | 10 of [S] | Problems of Molecular mechanics, Energy minimization algorithms | | Lecture 11 | [8] |
| Oct 27 | 11 of [S] | Monte Carlo sampling | | Lecture12 | [9] |
| Nov 1 | 12 of [S] | Molecular dynamics algorithms | HW#3 | Lecture13 | [10], [11], [12] |
| Nov 3 | | DNA topology | | Lecture14 | [13]-[16] |
| Nov 10 | | DNA mechanics and statistical mechanics | | Lecture15 | [17]-[20] |
| Nov 15 | | Metabolic networks | | Lecture16 | [21]-[24] |
| Nov 17 | | Metabolic networks | | | |
| Nov 22 | | Systems biology of cells, Genetic networks | | Lecture17 | [25]-[26] |
| Nov 29 | | Bayesian models | | Lecture18 | [27]-[30] |
| Dec 1 | | Nonlinear ODE models | | Lecture19 | |
| Dec 6 | | Monotone systems | | | [31]-[33] |
| Dec 8 | | Boolean and logic network models | | Lecture20 | [34]-[37] |
| Dec 13 | | Stochastic effects in gene expression | | Lecture21 | [38]-[40] |
| Dec 15 | | Stochastic simulation | Exam | Lecture22 | [41]-[43] |

Related Papers

[1] Stormo, DNA binding sites: representation and discovery. Bioinformatics 16 (2000)

[2] Henikoff & Henikoff, Amino acid substitution matrices from protein blocks, PNAS 89 (1992): 10915-10919

[3] Altschul et al., Basic local alignment search tool. JMB 215 (1990)

[4] Karlin & Altschul, Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes, PNAS 87 (1990) 2264-8.

[5] Burge & Karlin, Prediction of complete gene structures in human genomic DNA, JMB 268 (1997)

[6] Karplus & McCammon, Molecular dynamics simulation of macromolecules, Nature Struct. Bio. (2002)

[7] Sagui & Darden, Molecular Dynamics Simulations Of Biomolecules: Long-Range Electrostatic Effects, Annu. Rev. Biophys. Biomol. Struct. (1999)

[8] Nocedal, Theory of Algorithms for Unconstrained Optimization, Acta Numerica (1992)

[9] Metropolis et al., Equation of state calculation by fast computing machines, JCP (1953)

[10] Skeel, Zhang, & Schlick, A family of symplectic integrators, SIAM J. Sci. Comput. (1997)

[11] Schlick et al., Algorithmic challenges in computational molecular biophysics, JCP (1999)

[12] Schlick et al., Biomolecular dynamics at long timesteps, Ann. Rev. Biophys. Biomol. Struct. (1997)

[13] Fuller, The writhing number of a space curve, PNAS (1971)

[14] Vologodskii, Topology and physics of circular DNA (1992)

[15] Pohl, DNA and differential geometry, Math. Intelligencer (1980)

[16] White, Introduction to the geometry and topology of DNA structure, Mathematical methods for DNA sequences, CRC (1989)

[17] Coleman & Swigon, Theory of Supercoiled Elastic Rings with Self-Contact and Its Application to DNA Plasmids, J. Elasticity (2000)

[18] Bustamante et al., Single molecule studies of DNA mechanics, Cur. Opin. Struct. Bio. (2000)

[19] Charvin et al., Twisting DNA: Single molecule studies, Contemporary Physics (2004)

[20] Rybenkov et al., Simplification of DNA Topology Below Equilibrium Values by Type II Topoisomerases, Science (1997)

[21] Feinberg, The existence and uniqueness of steady states for a class of chemical reaction networks, Arch. Rational Mech. Anal. (1995)

[22] Hofmeyr, Metabolic control analysis in a nutshell (2001)

[23] Vo et al., Reconstruction and Functional Characterization of the Human Mitochondrial Metabolic Network Based on Proteomic and Biochemical Data, JBC (2004)

[24] Gunawardena, Notes on Metabolic Control Analysis, (2002)

[25] de Jong, Modeling and Simulation of Genetic Regulatory Systems: A Literature Review, J. Comp. Biol. (2002)

[26] Smolen, Modeling transcriptional control in gene networks: Methods, recent results, and future directions, Bull. Math. Biol. (2000)

[27] Friedman et al., Using Bayesian networks to analyze expression data, JCB (2000)

[28] Yuh et al., Cis-regulatory logic in the endo16 gene: switching from a specification to a differentiation mode of control, Development (2001).

[29] Ideker et al., Integrated Genomic and Proteomic Analyses of a Systematically Perturbed Metabolic Network, Science, (2001)

[30] Huang, Gene expression profiling, genetic networks, and cellular states, JMM (1999).

[31] Sontag, Molecular Systems Biology and Control, (2005)

[32] Enciso & Sontag, On the stability of testosterone dynamics, JMB (2004)

[33] Smith, Systems of ordinary differential equations which generate an order preserving flow, SIAM Review (1988).

[34] Edwards, Analysis of continuous-time switching networks, Physica D (2000).

[35] Mestl et al., Chaos in high-dimensional neural and gene networks, Physica D (1996).

[36] De Jong et al., Qualitative simulation of genetic regulatory networks using piecewise-linear models, BMB (2003).

[37] Thomas, Multistationarity, the basis of cell differentiation and memory II., Chaos (2001).

[38] Thattai & van Oudenaarden, Intrinsic noise in gene regulatory networks, PNAS (2001)

[39] Kepler & Elston, Stochasticity in transcription regulation, Biophys.J., (2001)

[40] Rao et al., Control, exploitation and tolerance of intracellular noise, Nature (2002)

[41] Gillespie, Exact stochastic simulation of coupled chemical reactions, JPC (1977)

[42] Cao et al., The numerical stability of leaping methods for stochastic simulation of chemically reacting systems, JCP (2004)

[43] Vilar et al., Mechanisms of noise-resistance in genetic oscillators, PNAS (2001)

Other resources

Online Texts

Rudimentary introduction to DNA and molecular biology:

Molecular Biology and Genetics Primer for Mathematicians

DNA from the beginning

Molecular genetics

Ron Shamir’s Course notes on sequence analysis, pattern searching, and hidden Markov models

Books

For a standard introduction to molecular biology I recommend the following undergraduate textbook:

Lodish, Berk, Zipursky, Matsudaira, Baltimore, Darnell, Molecular cell biology, W. H. Freeman & Co., 2000

Free textbook on molecular dynamics (Google account required):

Rapaport, The Art of Molecular Dynamics Simulation

Databases

NCBI (National Center for Biotechnology Information) includes GenBank and many other databases

PDB (Protein Data Bank)

NDB (Nucleic Acids Database)

SwissProt

KEGG (Kyoto Encyclopedia of Genes and Genomes)

RegulonDB (Transcription regulation in E. coli)

SBML (Systems Biology Markup Language)

Software

Sequence alignment

ClustalW

BCM multiple sequence alignments

USC sequence alignment server

Biosequence analysis - complements the Durbin et al. book, Java and applet code

Hidden Markov Models for sequence analysis

HMMER

Gene finding

GENSCAN

Phylogenetic analysis

Construction of phylogenetic trees

PHYLIP

Molecular Visualization and Modeling

RasMol - basic tool used by the majority, small code, fast

PyMol - Python-based, excellent graphics and visualization options

Insight II (commercial)

Gromacs

TINKER

Genetic Networks

BioNetS (Biochemical Network Stochastic Simulator)

XPPAUT (General purpose ODE solver)

DESS (Ordinary differential equations solver - Applet)