
Molecular Biology Homework

Assigned Jan. 27, due Feb. 3
1. In contrast to the differential equation and solution for bacterial growth that was presented in class and in the book, some populations, for example, number of rabbits per acre, are better described by difference equations giving the number of rabbits present in one year as a function of the number of rabbits present the previous year. The number of rabbits present in the n+1 th year, N(n+1), can be considered to be r x N(n) where r is a reproduction rate and N(n) is the number present in the nth year. This equation fails however to consider that there is a maximum population density. If the number of rabbits approaches this limit, let us say this is 1, their rate of growth is severely limited, and if the number is 1, no rabbits survive. Thus, we can write N(n+1) = r x N(n) x (1- N(n)). Determine N(n) for the first 500 years beginning from an N of 0.001 and plot N(n) for the 500 generations and for the final 50. Investigate the population behavior for values of r ranging from 0.9 to 4. If you are doing this with a spreadsheet, for the calculation of N(n+1) from N(n), refer back to a particular cell in the spreadsheet that will contain the value for r. This will enable you to easily change r and see the results on two graphs, one covering generations 1-500 and the other covering generations 450-500. Turn in graphs characteristic of the different behavior(s) you find. Comment on the behavior(s) you find.

At values of r less than one, the population dies out. From r = 1 to about 3, the population rises to a single steady-state value (close to r = 3 the approach is so slow that a small oscillation can still be visible after 500 generations). Just above r = 3, the population settles into a steady state that bounces between two values. At about r = 3.45 each of these two values splits, and at about r = 3.55 each of the four values splits again, giving eight values that the population visits in succession. Further doublings follow in rapid succession, and above roughly r = 3.57 the population numbers become chaotic with no visible repetitive pattern; this is unmistakable by r = 3.8.
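For anyone not using a spreadsheet, the iteration can be sketched in a few lines of Python. This is only a minimal illustration of the difference equation from the problem; the function names, the rounding to four digits, and the particular r values printed are choices made here, not part of the assignment.

```python
def simulate(r, n0=0.001, generations=500):
    """Iterate N(n+1) = r * N(n) * (1 - N(n)) and return the whole trajectory."""
    values = [n0]
    for _ in range(generations):
        n = values[-1]
        values.append(r * n * (1.0 - n))
    return values


def distinct_tail_values(r, tail=50, digits=4):
    """Count the distinct (rounded) population values in the final `tail` generations."""
    trajectory = simulate(r)
    return len({round(v, digits) for v in trajectory[-tail:]})


if __name__ == "__main__":
    # One r from each regime described in the answer (assumed representative values).
    for r in (0.9, 2.5, 3.2, 3.5, 3.9):
        count = distinct_tail_values(r)
        print(f"r = {r}: {count} distinct value(s) in the last 50 generations")
```

Plotting `simulate(r)` in full and `simulate(r)[-50:]` gives the two graphs the problem asks for.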

2. If the interior of a bacterial cell is pH 7, how many H+ ions at any instant are present within the cell? Comment on your answer, in particular with respect to the fact that there are a million or so molecules of protein present within a cell and they all are sensitive to pH.

pH = -log[H+]

[H+] = 10⁻⁷ mol/L = 6.02 x 10¹⁶ molecules/L = 6.02 x 10¹⁹ molecules/m³

1 E. coli cell = 1 μm³ = 1 x 10⁻¹⁸ m³

(6.02 x 10¹⁹ molecules/m³) x (1 x 10⁻¹⁸ m³/cell) = 60.2 molecules/cell

Using 1 μm³ as the volume of a bacterial cell, we find that, at pH 7, there are only about 60 H+ ions in this volume. Thus, each of the millions of pH-sensitive proteins present, each capable only of sampling its immediate environment, must somehow be responding to the fact that there are only this many in the whole cell. The problem is resolved via the high diffusion rate: relative to the size of the cell, H+ ions move very fast, so they collide with proteins at a high enough rate for the concentration to be sensed accurately.
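The arithmetic above is easy to repeat for other pH values or cell volumes. The following is a small sketch; the function name and the default volume of 1 μm³ (10⁻¹⁸ m³) are simply the assumptions already used in the answer.

```python
AVOGADRO = 6.022e23               # particles per mole
LITERS_PER_CUBIC_METER = 1000.0


def ions_per_cell(ph, cell_volume_m3=1e-18):
    """Number of free H+ ions in a cell of the given volume at the given pH."""
    molar = 10.0 ** (-ph)                                  # mol/L
    per_cubic_meter = molar * AVOGADRO * LITERS_PER_CUBIC_METER
    return per_cubic_meter * cell_volume_m3


print(ions_per_cell(7.0))   # number of H+ ions in a 1 cubic micrometer cell at pH 7
```

Because each pH unit is a factor of ten, the same cell at pH 8 would hold only about six free H+ ions.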

Assigned Feb. 1, due Feb. 8
1. What happens to the electrophoresis migration speed of DNA if you put more salt in the gel and the buffer bathing the gel a. if the power supply is set for constant voltage, and b. if the power supply is set for constant current?

It is the electric field acting on the charges on a molecule that moves the molecule in electrophoresis. Thus, within limits, the migration rate of any molecule through a gel is proportional to the electrical charge on the molecule and the electric field that the molecule feels.

Increasing the salt concentration of the system provides more ions, increasing conductance and decreasing resistance.

a) If the power supply is set for constant voltage, the field across the gel is unchanged, so the migration rate of DNA will be pretty much independent of the amount of salt in the system.

b) On the other hand, if the power supply is set for constant current, by Ohm's law, V = IR, where V=voltage, I=current, and R=resistance, increasing the salt concentration reduces R, and hence the DNA will see lower voltage and run more slowly.
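The two cases can be put into numbers with a toy calculation. This is only a sketch: the gel is treated as a simple resistor, and the voltage, current, resistance, and gel length below are hypothetical values chosen for illustration, with the added salt assumed to halve the resistance.

```python
def field_constant_voltage(voltage_v, gel_length_cm):
    """Field (V/cm) when the supply holds the voltage fixed; resistance does not enter."""
    return voltage_v / gel_length_cm


def field_constant_current(current_a, resistance_ohm, gel_length_cm):
    """Field (V/cm) when the supply holds the current fixed; V = I * R."""
    return current_a * resistance_ohm / gel_length_cm


GEL_LENGTH_CM = 10.0     # hypothetical gel length
LOW_SALT_OHM = 2000.0    # hypothetical resistance before adding salt
HIGH_SALT_OHM = 1000.0   # assumed: more salt halves the resistance

print(field_constant_voltage(100.0, GEL_LENGTH_CM))                 # same before and after salt
print(field_constant_current(0.05, LOW_SALT_OHM, GEL_LENGTH_CM))    # low salt
print(field_constant_current(0.05, HIGH_SALT_OHM, GEL_LENGTH_CM))   # high salt
```

At constant current the field, and with it the migration rate, falls in proportion to the resistance.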

2. Ethylation of a phosphate group of a base pair in a protein's DNA recognition site can block the binding of the protein. Let * represent an ethylation of a nucleotide that blocks a protein's binding. Only ethylations of the indicated nucleotides blocked a particular protein's binding. What do you conclude?
5'-N* N* N* N N N N N N-3'
3'-M M M M M M M* M* M*-5'

This pattern of interference would be seen in the case of a straight rod, i.e. an alpha helix, binding in the major groove. Even though it seems paradoxical, this interference pattern results from an interaction from one side of the DNA. Find a graphical or physical model and convince yourself of this, and then ask what the interference pattern would be for an interaction in the minor groove.

Assigned Feb. 3, due Feb. 10
1. Problem 3.7 from the text. Note, I will discuss this topic further in the lecture on Feb. 8.

It takes 40 minutes to replicate the genome from beginning to end. In the case of a 40-minute doubling time, one can assume that this would not change, and the polymerase will continue at the same rate, with a new round of initiation occurring only upon completion of the previous one.

2. Kinetic proofreading is a mechanism for increasing the accuracy of some biological processes. Why does or does not the 3'-5' exonucleolytic removal of an incorrectly incorporated nucleotide qualify as kinetic proofreading?

Kinetic proofreading denotes a method for error correction providing fidelity beyond that of a probabilistic model based on energy difference alone. This ensures that the correct product of a reaction is more heavily favored over incorrect ones, and is accomplished via irreversible removal of incorrect products in a process that requires the consumption of energy. In the case of the 3'-5' exonuclease activity of DNA polymerase, the products in question are any incorporated base. An incorrect nucleotide, once removed, cannot be directly reincorporated, because it is released as a monophosphate rather than a triphosphate, and is thus removed from the pathway, ensuring that the correct product (in this case the proper nucleotide sequence) is formed.

Assigned Feb. 8, due Feb. 17
1. Prob. 4.2 from text.

4566 nt total from 3 ribosomal RNA subunits

10⁴ ribosomes required in 50 minutes (50 x 60 = 3000 seconds)

70 nt/s/polymerase * 3000 s * 1 ribosome / 4566 nt = 46 ribosomes/polymerase

10⁴ ribosomes * 1 polymerase / 46 ribosomes = 217.4 polymerases
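The same estimate can be written out so the inputs are easy to change. This is a sketch using only the numbers given in the answer above; the variable names are invented here.

```python
RRNA_NT_PER_RIBOSOME = 4566      # total length of the three rRNAs
RIBOSOMES_NEEDED = 10**4         # ribosomes to be made per doubling
DOUBLING_TIME_S = 50 * 60        # 50 minutes in seconds
ELONGATION_NT_PER_S = 70         # per polymerase

# rRNA sets (one per ribosome) that a single polymerase can finish in one doubling time
ribosomes_per_polymerase = ELONGATION_NT_PER_S * DOUBLING_TIME_S / RRNA_NT_PER_RIBOSOME

# polymerases that must be transcribing rRNA at any instant
polymerases_needed = RIBOSOMES_NEEDED / ribosomes_per_polymerase

print(round(ribosomes_per_polymerase), round(polymerases_needed, 1))
```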

2. Prob. 4.11 from text.

Three sites are involved here, two RNA polymerase binding sites, 1 and 2, and the A protein binding site. Polymerase bound to 1 blocks polymerase binding to 2. Bound A protein blocks polymerase binding to site 1, but not 2. If Site 1 binds polymerase rapidly but only very slowly dissociates or initiates, then adding polymerase before A gives a slowly increasing rate of initiations, but adding A before polymerase blocks the inhibitory site 1, and upon polymerase addition, it binds to site 2 and initiations almost immediately begin at the full rate.

Assigned Feb. 10, due Feb. 22
1. Become clear on the meanings of force, weight, and mass. Give a couple examples that enhance understanding at the molecular level of what a force of 10-15 piconewtons can and cannot do.

Force, measured in newtons (note the [lack of] capitalization; counterintuitive, I know), describes an interaction that is capable of changing the velocity of an object. This is represented by the equation F = ma; 1 newton (N) is the force necessary to accelerate 1 kg at 1 m/s². Mass, the m in this equation, measures an object's resistance to acceleration and does not depend on where the object is. Weight is specifically the force exerted on a given object by gravity.

Some examples for reference:

4 pN - Required to break a hydrogen bond (at 0 K)

5 pN - Exerted by kinesin walking on a microtubule

14 pN - Exerted by RNA polymerase during transcription

160 pN - To break a typical noncovalent bond

1,600 pN - To break a typical covalent bond

82,000 pN - Exerted on an electron in a hydrogen atom
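Two of these numbers can be put in context with a short calculation. The sketch below checks the last entry as the Coulomb force between a proton and an electron separated by one Bohr radius, and compares the work done by 14 pN over one base-pair step with thermal energy. The 0.34 nm step and the 310 K temperature are assumptions added here for the comparison; they are not taken from the list.

```python
COULOMB_CONSTANT = 8.9875517923e9     # N m^2 / C^2
ELEMENTARY_CHARGE = 1.602176634e-19   # C
BOHR_RADIUS = 5.29177210903e-11       # m
BOLTZMANN = 1.380649e-23              # J / K

PN_PER_NEWTON = 1e12
PN_NM_PER_JOULE = 1e21                # 1 J = 1e12 pN x 1e9 nm

# Force on the electron in a hydrogen atom, in pN
coulomb_force_pn = (COULOMB_CONSTANT * ELEMENTARY_CHARGE**2 / BOHR_RADIUS**2) * PN_PER_NEWTON

# Work done by RNA polymerase pushing with 14 pN over one base pair (assumed 0.34 nm)
work_per_bp_pn_nm = 14.0 * 0.34

# Thermal energy kT at an assumed 310 K, in the same units
thermal_energy_pn_nm = BOLTZMANN * 310.0 * PN_NM_PER_JOULE

print(round(coulomb_force_pn), work_per_bp_pn_nm, round(thermal_energy_pn_nm, 2))
```

Forces of 10-15 pN acting over molecular distances thus do work comparable to kT: enough to bias thermally driven motions, but far short of what is needed to break covalent bonds.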

Assigned Feb. 17, due Feb. 29
1 and 2. Suppose you have a 200 bp piece of DNA containing a single bacterial promoter with a fluorophore on one strand in the -10 region and a fluorescence quencher opposite the fluorophore on the other strand. Suppose you have immobilized about 100,000 polymerase molecules on a glass slide in the field of view of a microscope connected to a video recorder. The slide and polymerases are at zero degrees. The DNA, also at zero degrees is flushed in at a concentration such that most polymerases bind a DNA molecule. Now, magically, you raise the temperature instantaneously to 37 degrees and begin recording the appearance of fluorescent spots as the polymerase-DNA complexes convert to open complexes. The figure below shows two potential graphs of the number of new fluorescent spots per unit time. What would each graph say about the formation of open complexes?
I think it will be beneficial for you to simulate this to gain the most understanding, but a keen understanding would allow a person to reach the correct conclusions without simulations, and I will consider (with a somewhat jaundiced eye) descriptions of how to perform analytical solutions on the data that would provide similar answers.

The relevant factor in this or any other similar process for differentiating between the two trends is the number of steps in the reaction. In a single-step process, since the rate is based entirely on the number (or concentration) of reactants (in this case, polymerases bound), the first time point will always be the highest point, since, immediately following, there will be fewer reactants. In the case of two or more steps, however, reactants first need to be converted into intermediates, the concentration of which, in this case, begins at zero, before the final product is made; thus, the rate will rise to a maximum before falling again.
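The difference between the two cases can be seen in a very small deterministic simulation. This is a sketch only: it assumes irreversible first-order steps that all share the same rate constant k, uses simple Euler integration, and the values of k, the time step, and the duration are arbitrary choices made here.

```python
def rate_of_appearance(n_steps, k=1.0, total=100_000, dt=0.01, t_end=10.0):
    """Rate of open-complex (fluorescent spot) formation versus time.

    The complexes pass through `n_steps` irreversible first-order steps,
    each with rate constant k, before becoming fluorescent.
    """
    # states[0] is the starting closed complex; later entries are intermediates
    states = [float(total)] + [0.0] * (n_steps - 1)
    rates = []
    for _ in range(int(t_end / dt)):
        flux = [k * s for s in states]      # flux out of each pre-open state
        rates.append(flux[-1])              # the last flux creates open complexes
        for i in range(n_steps):
            states[i] -= flux[i] * dt
            if i > 0:
                states[i] += flux[i - 1] * dt
    return rates


one_step = rate_of_appearance(1)
two_step = rate_of_appearance(2)
print(one_step.index(max(one_step)), two_step.index(max(two_step)))
```

With one step the maximum rate is at the first time point; with two steps the rate starts at zero and peaks later.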

Simulation Excel spreadsheet

Assigned Feb. 22, due March 2
1. and 2. See Figure 1 of Nature 461, 644-649 (2009) (This might already be familiar to you.) The figure illustrates a gene in which obligatory alternative mRNA splicing occurs. Your problem is to devise the simplest mechanism you can which will generate such obligatory alternative splicing. By the way, don't feel that anyone is restricting you to Figure 1, the rest of the paper is very interesting also.

The splicing depicted here has three (3) important elements:

1 - Only one exon each from clusters 4, 6, and 9 is used.

2 - The cell is able to identify "self," i.e. the cell expresses only one Dscam isoform.

3 - The cell is able to identify "non-self," i.e. each cell expresses a unique isoform.

There are multiple ways this could be accomplished. At the DNA level, each cell could randomly remove all but a single exon at an early stage, such that a single, unique transcript is always generated. One could imagine, for instance, a pair of DNA-binding proteins for each exon cluster, one slow-binding, and a second that binds quickly once the first is bound. The former could serve to designate the retained exon, with the latter blocking additional associations and/or marking other exons for removal.

Assigned Feb. 29, due March 7
1. and 2. AraC protein has an annoying property of aggregating at high protein concentrations. Are there candidate patches of hydrophobicity on the surface of AraC dimerization or DNA binding domains that could mediate such aggregation? If you don't have PyMol or VMD running on your computer you should install one of these and learn how to download a pdb file from the RCSB Protein databank and visualize and manipulate a graphical representation of a protein and visualize the domains of AraC to answer the question.

Assigned March 2, due March 9
1. What is the minimum number of tRNAs required to read all the sense codons, and why doesn't wobble lead to mistaken attempts at the ribosome to initiate translation with isoleucine?

Due to wobble base-pairing, a G in the 5'-most position of the tRNA anticodon could pair with either a C or U on the mRNA. Additionally, a 5' U on the tRNA could pair with either A or G, leading to 4 x 4 x 2 = 32 possibilities, one of which would be exclusively for stop (nonsense) codons (UAA & UAG). Thus, all sense codons could be covered by 31 tRNAs.
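The count can be checked by enumeration. The sketch below assumes exactly the wobble rules stated above (one anticodon reads the two codons ending in U/C, another the two ending in A/G) and drops any set that consists only of stop codons; the function and variable names are made up for the illustration.

```python
from itertools import product

BASES = "UCAG"
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Third-position pairs read by a single anticodon under the wobble rules above:
# a 5' G in the anticodon reads U or C, a 5' U reads A or G.
WOBBLE_PAIRS = ("UC", "AG")


def minimum_trnas():
    """Count the anticodons needed to read every sense codon."""
    needed = 0
    for first, second in product(BASES, repeat=2):
        for pair in WOBBLE_PAIRS:
            codons = {first + second + third for third in pair}
            if not codons <= STOP_CODONS:   # skip sets made up only of stop codons
                needed += 1
    return needed


print(minimum_trnas())
```

Note that UGA shares its set with UGG (tryptophan), so only the UAA/UAG set is removed.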

Translation initiation requires certain initiation factors that are specific for the initiator methionine tRNA, preventing isoleucine tRNA from mistakenly kicking off the process. Additionally, the isoleucine tRNA may use inosine as its first anticodon base, which can wobble base-pair with A, C, or U, but, notably, not G, thus precluding its mistaken pairing with an AUG codon.

2. Why might it be reasonable for a UGA codon to exist within the gene for the R2 release factor?

The UGA (opal) codon is specifically recognized by the R2 release factor. Thus, having a UGA within the gene allows for a self-regulating negative feedback loop. When R2 levels are low, the gene can be translated normally, but once enough R2 has been synthesized, it will terminate translation early at the internal UGA, preventing too much from being made.

Assigned March 9, due March 23
1. The ligation of DNA fragments containing four base overhanging sticky ends produced by such enzymes as EcoRI used to be done by incubating 12 hours at 14 degrees. It has been found however that by adding PEG 6000 to 7.5%, the ligation can be performed in 10 minutes at 20 degrees. How does the PEG change things so dramatically?

Polyethylene glycol (PEG) 6000 is a large polymer (about 6000 daltons, hence the numerical designation). Its addition to an in vitro reaction creates a molecular crowding effect, capable of replicating the densely packed environment of the cell. This has the consequence of effectively increasing local concentrations of reactants, speeding up reactions that may otherwise take longer to initiate.

2. Suppose you find that a bacterium (not E. coli, and not suitable for transforming or for general use) contains a restriction enzyme that you would very much like to clone so you can overproduce it in E. coli for study. How would you select for a clone carrying the gene for the restriction enzyme?

The simplest selection method would likely be to infect our cells with a lytic bacteriophage containing the restriction site in an essential region. Any cells that survive would likely contain the gene of interest. This could then be verified with an in vitro reaction analyzed by gel electrophoresis.

Assigned March 21, due March 28
1., 2., and 3. The mutual information method described in PNAS 107, 9158-9163 (2010) appears to be the best approach by far for analyzing the contributions of individual nucleotides in a promoter or enhancer to overall activity. Not without good reason, however, this powerful method remains almost unknown to molecular biologists. Why? Also, provide the best summary that you can of how the method works.

The method reported in this paper has gone largely unnoticed because it was presented in an extremely physics-centered, mathematically intensive, and borderline impenetrable manner. The technique used plasmids with GFP driven by a randomly mutagenized promoter. These were transformed into E. coli cells, which were sorted by FACS into groups according to fluorescence intensity. The plasmids in each group were then deep sequenced, and the relative effect of each base on fluorescence intensity was computed.
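To make the last step concrete, here is a minimal sketch of how the mutual information between the base at one promoter position and the fluorescence group could be computed from sequence counts. It is not the authors' pipeline; the function name and the two toy data sets are hypothetical and exist only to show the two extremes.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Mutual information (bits) between base identity and fluorescence group.

    `pairs` is a list of (base, group) tuples, one per sequenced plasmid,
    all taken at a single promoter position.
    """
    n = len(pairs)
    joint = Counter(pairs)
    base_counts = Counter(base for base, _ in pairs)
    group_counts = Counter(group for _, group in pairs)
    mi = 0.0
    for (base, group), count in joint.items():
        p_joint = count / n
        # ratio of the joint probability to the product of the marginals
        ratio = p_joint * n * n / (base_counts[base] * group_counts[group])
        mi += p_joint * log2(ratio)
    return mi


# Hypothetical toy data: a position where the base fully determines the group...
informative = [("A", "high")] * 50 + [("G", "low")] * 50
# ...and a position where the base tells nothing about the group.
uninformative = [("A", "high"), ("A", "low"), ("G", "high"), ("G", "low")] * 25

print(mutual_information(informative), mutual_information(uninformative))
```

Positions that matter for promoter activity score high; positions that can be mutated freely score near zero.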

Assigned March 23, due April 6
1., 2. What important question(s) concerning CRISPR-Cas remain unanswered, and pretty much unmentioned in publications on the system?

Assigned April 4, due April 11
1. Why, mechanistically, is C⁺ dominant to C^c?

AraC can behave as both a repressor (when arabinose is absent) and an inducer (when arabinose is present). The C^c variant lacks the ability to repress, and is only capable of induction. C⁺, the properly functioning form, when coexpressed with C^c is still capable of establishing a DNA loop for repression. Note that the affinity of AraC for the O₂ site (bound only when araBAD is repressed) is 10x that of the I₂ (bound only when araBAD is induced), so a repressive complex dissociates much more slowly; "microdissociation" events of C^c from DNA will be more common than those of C⁺, allowing competing C⁺ to replace C^c much more easily (in the absence of arabinose) than vice versa.

Please note that there is an error in the book, which states that heterodimers of C^c/C⁺ may display a WT phenotype. In the years since publication, this notion has been disproven.

2. Design an experiment to determine the rate of subunit exchange in AraC.

A slight adaptation of an experiment mentioned in class would begin with a population of AraC with a fluorescently labeled interdomain linker. A large excess of unlabeled dimerization domain, modified to be incapable of binding arabinose, is then added (note that the order is important; otherwise you will have a large background and be unable to measure anything). By measuring fluorescence anisotropy, a change in the tumbling rate of the fluorophore can be observed, and the rate of the total change is indicative of the subunit exchange rate.

Additionally, it is conceivable to use FRET to measure this as well, using 2 populations of AraC monomers: one tagged with a donor fluorophore, and one with an acceptor. As above, if a large excess of the latter is introduced to a pre-equilibrated population of the former (again, the order is important), it is possible to measure the formation of heterodimers.

Assigned April 6, due April 13
1., 2. What do you conclude about the structure and function of E. coli AraC after examining alignments between the AraC protein and its homologs (using standard web tools)? You will need to find the sequence of AraC, then do a BLAST search, and probably then look both at the 3D structure of AraC and at some of the sequence alignments.

There are numerous molecules homologous to one or more domains of AraC. There are several proteins (such as MarA and ToxT) that have a high degree of structural similarity to the C-terminal half (or so) of AraC. These proteins are, themselves, involved in DNA binding, suggesting that this is, likewise, the function of this region of AraC.

The N-terminal portion (dimerization domain) is conserved across many even distantly related bacterial genera, while the DNA binding domain in many of these is more divergent, suggesting a common mechanism for arabinose binding and response, mediated by the dimerization domain, albeit with likely different DNA targets.

Assigned April 11, due April 18
1. Simultaneous infection of a lambda lysogen with lambda and lambda i434 produces bursts containing only lambda i434. Why? How would you isolate lambda mutants that can grow on lambda lysogens when coinfected with lambda i434, and where would you expect these to map?

2. Suppose that Int minus lambda cannot lysogenize at all. It is found however, that Mu phage can help lambda to integrate. It is also found that Mu-assisted integration of lambda is not altered if the POP' of lambda is deleted. Predict the genetic structure of Mu-assisted lambda lysogens.

Assigned April 13, due April 20
1., 2. Suppose that after considerable work you have 1 ml of a solution that can transmit scrapie to, at most, 100,000 mice (not that you would use your whole sample to do this). Using the most sensitive assays currently available, what can you say about the nucleic acid content of an infectious particle?

Ignoring for a moment that we already know that scrapie is caused by a prion, we can say in this case that there are, at most, 100,000 molecules of infectious material. Without amplification, this would be below our ability to detect nucleic acid.

Assigned April 18, due April 25
1. Why does it make sense for transposable elements to make staggered nicks rather than opposed nicks in the process of transposition?

2. What fraction of the human genome seems to be insertion sequences or transposable elements? How do you reconcile this number with claims of the ENCODE project?