Basic Genetics for University: Molecular Genetics

Molecular Genetics

This page covers the molecular side of genetics. It is consolidated from individual posts on transcription and translation from my other blog, but it's only the tip of the iceberg! Remember that "form fits function". The structure of DNA (two polynucleotide strands held together by hydrogen bonds) was the first clue to the mechanism of DNA replication. And once the mechanism of replication was worked out, more ideas about how DNA functions were generated.

DNA Structure

Here's an important skill: you should be able to draw and number a pentose sugar. The bases are a bit harder to draw - they have all kinds of functional groups and there are two fairly different structures (purines and pyrimidines).

You should for sure recognize a purine and know that there are two types in nucleic acids (purines have the double-ring structure: A and G). You should also recognize pyrimidines and know which bases fall into this category (single-ring: C and either T or U, depending on whether it's DNA or RNA).

Each nucleotide has the nitrogenous base attached to the 1' carbon of the pentose sugar. And you number the carbons sequentially from that starting point.

So here: I'll leave you with a challenge. Draw a pentose sugar attached to a purine (don't sweat the purine details too much... just make it distinguishable from a pyrimidine). Be able to alter your figure to depict a ribose, a deoxyribose, and that really strange entity used for Sanger sequencing: a dideoxyribose. If you can, depict where the phosphates fit on the structure, and then number the pentose carbons according to standard conventions.

Here's how I'd do it (quick'n'dirty 3 minute video):

Bidirectional Replication

Here's a quick video (~12 min) regarding the replication fork and DNA synthesis. It is done by Dr. Bird.

Transcription

Here are some copyright-free lessons I created. Anyone can redistribute or remix as long as I'm identified as the author and the images I used retain their author identifications.

Part 1 - Overview

Part 2 - A little more detail

Translation

Now that we've got transcription under our belts, we can look at translation. Translation is where the information that was stored in genes (DNA) is used to create a protein. Note that the DNA information first had to be copied (transcribed) into a new molecule: RNA. It's very common for students to be unclear about which processes occur during transcription and which occur during translation.

When you're talking about promoters and terminators, enhancers, silencers, and RNA polymerase, you're talking about transcription. In eukaryotes, RNA splicing, 7-methylguanosine (m7G) capping, and the poly-A tail are all modifications of the transcript (RNA strand). These events occur BEFORE translation.

Note that only when we're dealing with translation do we start to worry about start and stop codons. Most people have dealt with codons and anticodons in their introductory biology classes, but in my lecture below I look them over again quickly. Note that I'm adding a little more complexity to these introductory concepts: I want everyone to know that there are special parts of mRNA - the 5'UTR and the 3'UTR (untranslated regions). The borders of these are defined by the start and stop codons.
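To make the "borders" idea concrete, here is a minimal sketch that splits an mRNA into its three regions. The function name split_mrna and the toy sequence are made up for illustration. It assumes the first AUG is the start codon (a simplification) and counts the stop codon as part of the coding region (a convention, not a rule from the lectures).

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def split_mrna(mrna):
    """Split an mRNA (written 5'->3') into (5'UTR, coding region, 3'UTR).

    Assumption for this sketch: the first AUG is the start codon.
    """
    start = mrna.find("AUG")
    if start == -1:
        raise ValueError("no start codon found")
    # Read codon by codon, in frame with the start codon, until a stop codon.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3  # convention here: the stop codon ends the coding region
            return mrna[:start], mrna[start:end], mrna[end:]
    raise ValueError("no in-frame stop codon found")

# Hypothetical toy mRNA, not taken from any real gene.
five_utr, coding, three_utr = split_mrna("GGAGCAUGCCAUGGUAAGCUU")
print("5'UTR :", five_utr)
print("coding:", coding)
print("3'UTR :", three_utr)
```

Everything upstream of the start codon is 5'UTR and everything downstream of the stop codon is 3'UTR, which is all that "untranslated" means here.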

In translation, there's also a set of steps for getting the ribosome to load onto the mRNA. The ribosome is made of rRNA and protein. The small subunit in prokaryotes has a piece of rRNA that's complementary to a sequence in the 5'UTR upstream of the start codon: the Shine-Dalgarno box. Eukaryotes have something similar (the Kozak sequence, although it has the start codon embedded within the consensus sequence). After the small subunit has bound, the first tRNA with its amino acid (methionine in eukaryotes, formylmethionine in prokaryotes) binds to the start codon, and then the large subunit attaches.

Prokaryotes also transcribe and translate their genes simultaneously. Without a nuclear envelope to partition the RNA polymerase in one compartment (the nucleus) and the ribosomes in another (the cytoplasm), we can see translation of an mRNA occurring even as it's being manufactured! Eukaryotes have spatial separation, which may be why they are able to modify their mRNA so extensively before translation.

So, without more buildup, here are the video lectures for "translation"!

Part 1: Structure of the ribosome

Part 2: Using the Code

Part 3: Cracking the Code

Part 4: Posttranslational Modification and Shipping

Find the Intron

Here's a chance for you to try your hand at gene expression. In this exercise, you're given a piece of DNA and you're told that it encodes an mRNA that has a single intron. You need to transcribe and translate it, and you also need to use protein information to determine where the intron is.

You can access this question in a Word document online.

Find the intron:

You’ve cloned and sequenced a piece of DNA and you know that it contains a gene for the protein you’ve been working to characterize. The protein contains the amino acid sequence:

leu-pro-trp-ala-gly

The sequenced DNA reads as:

5’GAGCATCCCAGAGGAGGAGATGACACTCCCATGTCCACATGATTACGCAAGGGGCCGGTGGGTAATCGCATACGATTACC3’

3’CTCGTAGGGTCTCCTCCTCTACTGTGAGGGTACAGGTGTACTAATGCGTTCCCCGGCCACCCATTAGCGTATGCTAATGG5’

Write out the full pre-mRNA, including identifying the ends. Assume that your sequenced DNA would have had the promoter to the left and that the first nucleotide is at the -3 position. Circle the intron in your mRNA. Write out the full polypeptide, and label its ends as well.

Try the question yourself before accessing the YouTube solution below:

(You can get a larger image by clicking on the YouTube logo in the bottom-right hand corner of the video above).

If you want to try out a game that creates a new intron puzzle each time you run it, go to http://www2.mtroyal.ca/~tnickle/2101/cgi-shl/sampleFindIntron.cgi
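If you'd like to check your reasoning on puzzles of this kind, the logic can be sketched as a brute-force search: remove each candidate intron, translate what's left, and see whether the known peptide shows up. This is only a rough sketch under stated assumptions, not the method used in the video. The function names and the toy pre-mRNA are invented for illustration (it is NOT the sequence from the question above), translation is assumed to begin at the first AUG, and the candidate introns are restricted to those that start with GU and end with AG.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code in one-letter amino acid symbols; '*' marks a stop codon.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

def find_introns(pre_mrna, peptide):
    """Return (start, end) spans whose removal gives a protein containing peptide.

    Assumptions for this sketch: exactly one intron, beginning GU and ending AG.
    """
    hits = []
    for start in range(len(pre_mrna) - 1):
        if pre_mrna[start:start + 2] != "GU":
            continue
        for end in range(start + 4, len(pre_mrna) + 1):
            if pre_mrna[end - 2:end] != "AG":
                continue
            mature = pre_mrna[:start] + pre_mrna[end:]
            if peptide in translate(mature):
                hits.append((start, end))
    return hits

# Hypothetical toy pre-mRNA: exon 1 + intron + exon 2.
pre = "AUGCUUCC" + "GUAAGUCCCUAG" + "AUGGGCUGGUUAA"
print(find_introns(pre, "LPWAG"))  # leu-pro-trp-ala-gly in one-letter code
```

The spans are 0-based Python slices, so a result of (start, end) means the intron is pre[start:end]. On a real puzzle you would still want to confirm the answer by hand, since the point of the exercise is the reasoning.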

Operons

This is a tricky section for a lot of reasons. First, it's critical that students understand the functions of all the parts of genes. The promoter is, of course, where RNA polymerase binds (see previous lectures). The polymerase makes RNA. In the case of gene regulation, it's reasonable to assume that the RNA created is mRNA. Prokaryotes can also take advantage of polycistronic mRNA - multiple reading frames in a single molecule of mRNA. Ribosomes latch on with a Shine-Dalgarno sequence, and move from 5' to 3', making polypeptides.

Knowing that, think of the possibilities! If you control the mRNA, you control how much protein can be made. This is a good example of gene regulation.

Making molecules is expensive

For prokaryotes, it's most efficient to regulate genes by stopping transcription. If you don't make RNA for a protein, you save a lot of cellular energy. Each nucleotide added to the RNA uses up a nucleoside triphosphate (ATP, GTP, CTP and UTP all have similar energy profiles). And then there's the maintenance! It takes more energy to make an ATP than you get out of it. Making RNA is expensive. You might think ALL cells should use this type of regulation!

Cells are like cities - particularly eukaryotic cells

Keep in mind that it's also expensive to make a polypeptide. GTP pushes the ribosome along the mRNA. ATP charges each tRNA with its appropriate amino acid. Making the amino acids takes energy, as does recycling of unused proteins (and of course their polypeptides). Even if you use post-translational regulation (you regulate even after having MADE the polypeptide), you can still save some energy.

The bottom line is: Don't make things you don't need.

Why not just use transcriptional regulation? Let's look at how eukaryotes often have to manage their multicellular situations. You might have a tissue that has the job of secreting a hormone to signal a dramatic change with little warning. You can't do this well with transcriptional regulation alone.

I sometimes make an analogy of the cell being like a small city. The nucleus can be city hall: it contains plans and directions for growth and controlling how things get done. The mitochondria are the powerhouses. Lysosomes are the custodians, breaking down and cleaning up outdated or unnecessary parts. What about the police? Or the fire department? You don't want to start recruiting, training, and equipping either of these on short notice. If you start to look for people who would make good firefighters when a building has just caught fire, you'd be months late to do anything about it. Post-translational processing - like what happens with insulin - makes the response time dramatically shorter.

You can see why it might be useful to have different levels of gene control at hand.

Back to Bacteria

This section is about how prokaryotes do something eukaryotes do not: they use operons. I think that if I do my job as an educator well, you'll be disappointed that you, as a eukaryote, don't do this yourself. Operons are like a chandelier. You turn it on to impress people. You don't turn on HALF the bulbs ... you turn them all on. You have one light switch, which is elegant, beautiful, and obvious. You wouldn't have one switch for every bulb in the chandelier.

Can I throw another analogy at you? This one is an assembly line. Each job is done by one worker. A biochemical pathway is like that. Each enzyme is, by nature, able to do only one job. If you are taking some reactant and turning it into a product in four steps, you ALWAYS need all four enzymes. You can never "make do" with just three. If one worker is sick one day, the other three will only cost the company money in wages for work that can't get completed.

If you put all the genes on a single mRNA, you get all the workers. You don't even need the workers arranged in a particular order: you just need them linked by an "on switch". For RNA (i.e. transcriptional control) that "on switch" is the promoter.

Note that you get four DIFFERENT polypeptides from the same mRNA strand. They have different amino acid sequences, and therefore different functions.

Enter the operator

The operator comes between the promoter (or is a part of the promoter) and the +1 nucleotide in the mRNA. Such an arrangement makes up an "operon". The operator can bind to a protein to prevent synthesis of the mRNA. This is negative regulation, and you'll see more about it in the video that follows.

The protein that binds to the operator is called the "repressor". It can sense whether you want to transcribe the operon or not. The way it senses the conditions is by having two important sites: one site is the DNA binding domain, which is where it contacts the DNA. The other site is where it binds to some other molecule. Binding the other molecule changes the shape of the repressor. One conformation is that the repressor can bind DNA at the operator. The other conformation means it cannot. This shape alteration is called an allosteric shape change.

To see how this works in practice, watch the video lecture below:

There's also positive regulation.

Don't always be so negative.

The bacterium has a vested interest in making the best use of its resources. Sure, it's great to have the ability to break down lactose, but what are the end products? Glucose and galactose - and the latter is an isomer of the former. Galactose is quickly converted into glucose, and then is processed through glycolysis. So, does it make sense to make the enzymes to break down lactose if you already have glucose? Remember - making enzymes involves transcription and then translation, which are energetically expensive!

The answer is no. And the cool thing - to me - is that bacteria figured this out by themselves, and they don't even have a brain or reasoning power! Man, but evolution is cool.

Bacteria sense the amount of glucose by watching the product of another chemical reaction. The enzyme adenylate cyclase converts ATP into cAMP (cyclic adenosine monophosphate). The cAMP acts as a messenger to the cell ... it says "I'm starving". How does it do that? It turns out that when glucose levels are sufficient, adenylate cyclase activity is inhibited. Glucose interferes with the production of cAMP. Thus, when glucose is high, cAMP is low and vice-versa. cAMP activates the protein CAP (sometimes called CRP, for "cAMP Receptor Protein"). When CAP-cAMP binds at the promoter of the lac operon, it revs up transcription to a very high level. If lactose is absent, repression prevents transcription ... but if lactose is present, the repressor protein is deactivated and you get tonnes of permease and beta-galactosidase (and transacetylase, whatever that's for ...).
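The two controls described above (the repressor for negative regulation, CAP-cAMP for positive regulation) can be summarized as a small truth table. Here's a minimal sketch. The function name lac_transcription and the labels "off", "low", and "high" are my own simplifications for illustration; real cells show a continuum, and a repressed operon is still slightly leaky.

```python
def lac_transcription(lactose, glucose):
    """Rough transcription level of the lac operon for a wild-type cell.

    Simplified labels (assumptions for this sketch): "off", "low", "high".
    """
    repressor_on_operator = not lactose  # lactose present -> repressor can't bind the operator
    cap_camp_bound = not glucose         # low glucose -> high cAMP -> CAP-cAMP binds
    if repressor_on_operator:
        return "off"                     # negative regulation wins regardless of CAP
    return "high" if cap_camp_bound else "low"

for lactose in (False, True):
    for glucose in (False, True):
        level = lac_transcription(lactose, glucose)
        print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> {level}")
```

Only one of the four combinations (lactose present, glucose absent) gives the very high level of transcription, which is exactly when making the lactose-digesting enzymes pays off.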

More Operon Exercises

Here's a worksheet to practice your skills with gene regulation in prokaryotes using the lac operon. You can download the worksheet as a Word document online.

Here are the solutions! You can make them larger by going right to YouTube (click its icon at the lower-right side of each video). There is one video for each of the three stages above: Basic, with a Twist, and Sophisticated.

Basic

With a Twist

Sophisticated

Sanders Question 18 (Chapter 14)

Prokaryotes often cluster the information for proteins involved in the same biochemical pathway all together. Each protein is encoded by what we call an "ORF", each of which has its own start and stop codon. The old term for an ORF was a "cistron", and for that reason we call the mRNA that contains several ORFs "polycistronic mRNA". It's a long mRNA that a single ribosome will cruise along, creating the various proteins one after the other.

I made a video some years ago that shows this. It's in a copyright-protected part of my website, so you have to log in as "biostudent" and use the password "science". The URL is http://www2.mtroyal.ca/~tnickle/Animations/Lac-operon.html and you can see this process for yourself.

For the purpose of this exercise, you can ignore positive regulation (anything involving the CAP/cAMP complex). Your task is to figure out whether beta-galactosidase and permease (proteins from the lacZ and lacY genes, respectively) are likely to be produced by a cell in an environment where lactose is present or not present. This question mirrors the one in Sanders & Bowman, Genetic Analysis: An Integrated Approach (1st edition), chapter 14, question 18. Sadly, the question at the back of the textbook has an incorrect example. The one below has been corrected, and I'll explain why this one is correct when I show the answer.

Thanks to my super-amazing daughter, I was able to reproduce this table from the one that we handed out in lecture earlier (she did the typing 'cause my colleague Dr. Bird lost the original file!). Here's an important note: my other tutorials take into account the "polar effect", but this one doesn't. In my other tutorials, having lacZ- means the ribosome falls off before hitting the lacY ORF. In this case, you can assume that you can express permease even if lacZ isn't synthesized.

Here's the YouTube solution!

Fusion Operons

What's a fusion operon? Ask Dr. David Bird, an evil scientist who likes to test the understanding of his students regarding gene regulation by creating unholy and absurd mixtures of operons. Well, that's overstating it, but it's a neat puzzle that is created by joining control structures from different operons together.

In this case, the CAP site and operator of a lac operon were fused to the trpL and structural genes of a trp operon. Remember that the CAP site is for positive regulation, and that the lacO is bound by a repressor protein (it's not in this figure, but imagine it's there) in the absence of lactose. The trpL is the leader sequence that allows attenuation (premature end of transcription unless tryptophan levels are really low - it does this by making a hairpin structure that results in rho-independent termination of transcription).

See if you can fill out this table using those cues. Oh, and thanks go to Dr. Bird for generously supplying me with this example problem (and to Brittany for reminding me about this exercise)!

The solution:

Restriction Mapping

I embedded a code incorrectly in my video, so here's a replication of the Restriction Mapping question that's also found in DNA Technology. Sorry for the cross-post. Here's more practice for you to use to create a restriction map. Dr. Bird did another post on this type of exercise.

You can assume the plasmid is circular. Note that there is more than one solution to this exercise!

... and here's the way to find the solutions!

Meiosis with an Inversion

This is usually dealt with at the end of my course, but here's a typical problem that shows how meiosis can be side-tracked by chromosomal abnormalities. In this case, you have to deal with an inversion. Your task is to find out what the end products of meiosis (i.e. haploid cells) will be after a specific crossover during prophase I.

Here's how I solve this:

You can see a full-screen version by clicking on the YouTube link.

Holliday Structures inform the Meselson-Radding model of recombination

The videos demonstrate how the molecular mechanism of recombination can also be responsible for gene conversion. The organism in which gene conversion has been best studied is a mold that keeps its spores in a sac called an "ascus". As meiosis progresses, the cells are kept in a linear array which allows the scientist to follow the fate of each cell. You can map a gene with respect to its position from a centromere using this system.

But gene conversion, not mapping, is what I want to describe. When meiosis occurs, you should have equal numbers of each kind of allele when you end. For example, if you have a B and a b allele of a gene, the end product of meiosis, four cells, will contain equal numbers of B and b: two cells will be B and two will be b. Recombination will assort the alleles, but you're not gaining or losing any genetic material. If the haploid cells at the end of meiosis undergo mitosis, you'll double the number of both alleles: you'll get four cells that contain B and four that contain b.

Robin Holliday, a geneticist, was studying asci and mapping out how the alleles segregated. He noticed a strange phenomenon, though... sometimes instead of getting 4 B alleles and 4 b alleles, he got 5 B and 3 b. Sometimes it was 6 B and 2 b. Even the reverse happened: 3 B and 5 b. Sometimes it was 2 B and 6 b. Apparently one of the alleles got changed - converted - to the other allele in the cross.

Holliday recognized that the presence of both alleles together in the zygote might lead to them interacting. Crossover is when nonsister chromatids are more likely to interact, and he thought of a model where the double-helix of each chromatid might unzip and then base-pair with its nonsister partner. They'd have essentially the same nucleotide order, except for the region that differs to make them allelic and not identical, and so base-pairing can connect these nonsisters together. They'll be cut apart to separate, and depending how you cut them, you might cause a crossover event in which new allele combinations on a chromosome result.

Here's a video (courtesy of Brooker's Genetics, 2nd edition, McGraw-Hill)

Holliday's model has inherent symmetry, though, and the converted asci don't typically have symmetry in their numbers. 5:3 is not symmetrical. Matthew Meselson and his colleague Charles Radding modified Holliday's theory by introducing asymmetry: only one strand is cut and it dislodges the partner strand of the nonsister chromatid.

A later model, double-strand break repair, refines the crossover model further and is considered to be the most likely mechanism, although from what I hear, there are differences between species.

I like this concept as it demonstrates the progression of science, and how we can use concepts from different parts of the discipline to inform the other parts. This is a great combination of how classical and molecular genetics can cross-pollinate!

Elucidating the Codon Table with Random Copolymers

Before we got a codon table as we know it today, scientists had to figure out which amino acid each of the 64 possible nucleotide triplets stands for. It is child's play to list all the possible codons, but what they mean is another story!

The level of technology at the time was that random polymerization in vitro (that is, in a test tube with enzymes, buffers, and whatever nucleotides the scientist put into the tube) was possible, but precise construction of synthetic mRNAs wasn't an option.

So, scientists created synthetic mRNAs that contained only, say, uracil and cytosine. To distinguish between the U and C content of each codon, they ensured that the proportion of U would not equal that of C. Thus, they knew that if 80% C and 20% U were used, the most common codon would be CCC and the least common would be UUU. The expected proportion of CCC codons would be 0.8³ = 0.512 and that of UUU would be 0.2³ = 0.008.
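The arithmetic for the 80% C / 20% U example can be sketched in a few lines. This is just an illustration of the probability logic: the function names are made up, and the small table holds only the eight codons that can be built from U and C in the standard genetic code. Each codon's expected proportion is the product of the proportions of its three bases, and an amino acid's share is the sum over its codons.

```python
from collections import defaultdict
from itertools import product

# Only the eight codons that can be built from U and C (standard genetic code).
UC_CODONS = {
    "UUU": "Phe", "UUC": "Phe", "UCU": "Ser", "UCC": "Ser",
    "CUU": "Leu", "CUC": "Leu", "CCU": "Pro", "CCC": "Pro",
}

def codon_frequencies(base_freqs):
    """Expected proportion of each codon when bases are joined at random."""
    freqs = {}
    for codon in product(base_freqs, repeat=3):
        p = 1.0
        for base in codon:
            p *= base_freqs[base]  # independent draws, so probabilities multiply
        freqs["".join(codon)] = p
    return freqs

def amino_acid_frequencies(codon_freqs, table):
    """Add up the codon proportions that encode the same amino acid."""
    totals = defaultdict(float)
    for codon, p in codon_freqs.items():
        totals[table[codon]] += p
    return dict(totals)

codons = codon_frequencies({"C": 0.8, "U": 0.2})
for aa, p in sorted(amino_acid_frequencies(codons, UC_CODONS).items()):
    print(f"{aa}: {p:.3f}")
```

Comparing proportions like these with the amino acid composition actually recovered from the test tube is what let the codons be matched to amino acids.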

Using this logic, answer the following question:

A scientist creates a random copolymer from a solution containing 40% guanine and 60% cytosine nucleotides, plus the appropriate enzymes.

She then does in vitro translation, with all 20 amino acids available, and isolates only the oligopeptides that were formed.

Your job is to calculate which amino acids were in the polypeptides formed and the proportion of each represented.

Try this for yourself before accessing the answer below ...
