Introduction to Linkage Mapping

This handout is very similar to "How to Solve Linkage Map Problems."

Example 1: Two Point Cross

How do you tell your genes are linked?
How do you identify the parental linkage?
How do you calculate linkage map distance?

Mating: AaBb x aabb


A = Long antennae
a = Short antennae
B = Green eyebrows, and
b = Blue eyebrows.

How do you tell your genes are linked?

If the genes were independent, using standard genetic techniques you would predict that your offspring would have the phenotypic ratio:

1 Long Green : 1 Long Blue : 1 Short Green : 1 Short Blue

Given 2000 offspring, that would mean 500 of each of these phenotypic classes.

Say you make this mating, and your actual results look like this:

Long Green - 850
Long Blue - 150
Short Green - 150
Short Blue - 850

These are obviously not 500 of each, and the deviation is too large to be explained by chance variation. Thus, these two genes are not independent of each other (independence was an assumption we made when making the prediction above).
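The handout does not name a particular test for "too large to be explained by chance." One common choice is a chi-square goodness-of-fit test against the 1:1:1:1 expectation. The sketch below is only an illustration under that assumption; the variable names are made up for this example, and 7.815 is the standard tabulated critical value for 3 degrees of freedom at p = 0.05.

```python
# Observed counts from the AaBb x aabb cross in the text.
observed = {
    "Long Green": 850,
    "Long Blue": 150,
    "Short Green": 150,
    "Short Blue": 850,
}

total = sum(observed.values())       # 2000 offspring
expected = total / len(observed)     # 500 per class under a 1:1:1:1 ratio

# Chi-square statistic: sum over classes of (observed - expected)^2 / expected
chi_square = sum((count - expected) ** 2 / expected for count in observed.values())

# Assumption: we test at p = 0.05 with 3 degrees of freedom (4 classes - 1).
CRITICAL_VALUE_DF3_P05 = 7.815

print(f"chi-square = {chi_square:.1f}")
print("independence rejected:", chi_square > CRITICAL_VALUE_DF3_P05)
```

With these counts the statistic is far above the critical value, which matches the informal conclusion in the text.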

What are the parental linkages?

The only parent that matters for figuring things out in this cross is the AaBb parent. It is the one determining the phenotypes of the offspring.

There are two ways these alleles could be linked in this parent. Either the two dominant alleles (A and B) are on one chromosome and the two recessive alleles (a and b) are on the other, or one chromosome has A and b while the other has a and B. The former arrangement is called cis, the latter trans.

Since linkage means that the genes are not behaving independently of each other, the parental linkages should be found more frequently in the offspring than the recombinant linkages. A crossover must occur between the loci of our two genes in order to produce a recombinant offspring, and since a crossover involves only two of the four chromatids of a synapsed pair of homologous chromosomes, each crossover will produce only two recombinants, the other two being parental. Thus parental linkages will always be more common in the offspring. Looking at our offspring, the two largest offspring classes are Long, Green and Short, Blue. Thus, in the heterozygous parent, the two dominant alleles were on one chromosome and the two recessive alleles were on the other. We started out with a cis arrangement.

How do you calculate linkage map distance?

One linkage map unit (LMU) is 1% recombination. Thus, the linkage map distance between two genes is the percentage recombination between those genes.

In this case, we have a total of 300 recombinant offspring (150 Long Blue + 150 Short Green), out of 2000 total offspring. Map distance is calculated as (# Recombinants)/(Total offspring) x 100. So our map distance is (300/2000) x 100, or 15 LMU.
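The same arithmetic can be written as a small helper. This is just a sketch; the function name map_distance is made up for illustration.

```python
def map_distance(recombinants, total):
    """Return percent recombination, read directly as linkage map units (LMU)."""
    if total <= 0:
        raise ValueError("total offspring must be positive")
    return 100 * recombinants / total


# Example 1: the recombinant classes are Long Blue and Short Green.
recombinants = 150 + 150
total = 850 + 150 + 150 + 850

print(map_distance(recombinants, total))  # 15.0
```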

Linkage map units do not correspond to any fixed length of chromosome. Frequency of crossover (and thus of recombination) can be affected by location (crossing over is repressed close to the centromere) and by proximity to another crossover. Also, since scoring of recombinant offspring ignores what else might be going on between the two genes in question, the number of recombination events is always underestimated, because double crossovers are missed.

A distance of 50 LMU corresponds to independence. Note that, if two gene loci are far enough apart on a large chromosome, the crossover rate between them may be so high that they will map as if they are not linked.

Example 2: Two Point Cross

R = Round Nose
r = Pickle-shaped nose
Q = Stiff legs
q = Quivery legs

(There are strange creatures where I come from...)

RrQq is mated to rrqq.


Round Stiff - 200
Round Quivery - 600
Pickle Stiff - 600
Pickle Quivery - 200

Again, the existence of linkage is obvious. If there were no linkage, all phenotypic classes should be about equal in number.

In this case, the parental linkage is trans. The parental classes--always the largest classes--are Round Quivery and Pickle Stiff.

Never make the mistake of assuming your linkage will be cis.
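To make the cis/trans decision concrete, here is a minimal sketch that reads the phase off the data instead of assuming it. The notation is an assumption made for this example: each offspring class is labeled by the gamete it received from the heterozygous parent, with uppercase for dominant and lowercase for recessive alleles.

```python
def parental_phase(gamete_counts):
    """Return (parental gametes, 'cis' or 'trans') from two point testcross counts.

    gamete_counts maps a two-letter gamete such as "AB" or "Ab" to the number
    of offspring that received it from the heterozygous parent.
    """
    # The two largest classes are the parental classes.
    ranked = sorted(gamete_counts, key=gamete_counts.get, reverse=True)
    parentals = set(ranked[:2])

    # cis: a parental gamete carries both dominant or both recessive alleles.
    first = ranked[0]
    same_case = first[0].isupper() == first[1].isupper()
    return parentals, ("cis" if same_case else "trans")


example_1 = {"AB": 850, "Ab": 150, "aB": 150, "ab": 850}
example_2 = {"RQ": 200, "Rq": 600, "rQ": 600, "rq": 200}

print(parental_phase(example_1)[1])  # cis
print(parental_phase(example_2)[1])  # trans
```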

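Incidentally, the map distance between these two genes is (400/1600)x100 or 25 LMU. The recombinant classes are Round Stiff and Pickle Quivery (200 + 200 = 400), out of 1600 total offspring.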

Example 3: Three Point Cross

How do you tell your genes are linked?
How do you identify the parental linkage?
How do you calculate linkage map distance?

How do you tell your genes are linked?

The answer to this is pretty much the same for three point crosses as it is for two point crosses. First determine what you would expect if the genes were independent, then compare to the actual results of the mating.

Mating: DdFfGg x ddffgg


D = Calm personality
d = Dithery personality
F = Five toed
f = Four toed
G = Smooth fur, and
g = Grizzled fur.


Calm, Five, Smooth - 620
Calm, Five, Grizzled - 100
Calm, Four, Smooth - 75
Calm, Four, Grizzled - 5
Dithery, Five, Smooth - 5
Dithery, Five, Grizzled - 75
Dithery, Four, Smooth - 100
Dithery, Four, Grizzled - 620

Once again, assumption of independence would predict all phenotypic classes be about the same size. As above, the only parent influencing the diversity of the offspring is the heterozygous one. She can produce eight kinds of gametes (all eight possible combinations of her alleles), and all should be produced in equal numbers if the genes are independent. Since we obviously do not have equal numbers in our classes, we do not have independence. Our genes are linked.

How do you tell what the gene order is?

This task requires that you identify the parental classes and the double crossover classes among the offspring. The parental classes (there should be two, and they should be reciprocals) will be the largest classes. This is because the linkage makes the parental connections tend to remain intact more often than they would if assortment were independent. The double crossover classes (again, two and reciprocal) should be the smallest classes. This is because a double crossover requires two crossover events between the end genes, which is much less likely than a single crossover.

Double crossover offspring result from the occurrence of two crossover events, one between the first two genes in the sequence, and another between the second and third genes in the sequence. In effect, this means that only the middle gene has been recombined, and the two end genes remain in the parental configuration. (The first crossover swaps the chromatids, then the second swaps them back.)

To determine which gene is in the middle, you compare each double crossover class to the parental which is most like it. Whichever gene is different is the middle gene. The two double crossover classes should give the same result.

Parental Classes: Calm, Five, Smooth (620) and Dithery, Four, Grizzled (620)

Double Crossover Classes: Dithery, Five, Smooth (5) and Calm, Four, Grizzled (5).

Comparing, it is evident that the middle gene is the personality gene (Calm/Dithery).
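The comparison can be sketched in code. This is only an illustration of the rule described above (smallest class versus the parental most like it); the names GENES, counts, and middle_gene are made up for this example, and the genes are listed in the order the data were originally written, not in map order.

```python
GENES = ("personality", "toes", "fur")

counts = {
    ("Calm", "Five", "Smooth"): 620,
    ("Calm", "Five", "Grizzled"): 100,
    ("Calm", "Four", "Smooth"): 75,
    ("Calm", "Four", "Grizzled"): 5,
    ("Dithery", "Five", "Smooth"): 5,
    ("Dithery", "Five", "Grizzled"): 75,
    ("Dithery", "Four", "Smooth"): 100,
    ("Dithery", "Four", "Grizzled"): 620,
}


def middle_gene(counts, genes):
    """Compare a double crossover class to the parental class most like it."""
    ranked = sorted(counts, key=counts.get)   # smallest class first
    double_crossover = ranked[0]              # one of the two smallest classes
    parentals = ranked[-2:]                   # the two largest classes

    for parental in parentals:
        differing = [gene
                     for gene, p, d in zip(genes, parental, double_crossover)
                     if p != d]
        # The parental most like the double crossover differs at one gene only,
        # and that gene is the middle one.
        if len(differing) == 1:
            return differing[0]
    raise ValueError("no parental class differs from the double crossover at exactly one gene")


print(middle_gene(counts, GENES))  # personality
```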

Rewriting the genes in the correct order:

Five, Calm, Smooth - 620
Five, Calm, Grizzled - 100
Four, Calm, Smooth - 75
Four, Calm, Grizzled - 5
Five, Dithery, Smooth - 5
Five, Dithery, Grizzled - 75
Four, Dithery, Smooth - 100
Four, Dithery, Grizzled - 620

How do you calculate the map distances?

Before you can calculate map distance, you have to figure out which single crossover classes represent which region of crossing over. This is done by identifying the reciprocal pairs of single crossovers, then once again comparing each to the parental class most like it. Noting which of the three genes has been switched will tell you where the crossover occurred.

The single crossovers between the toes and personality genes are Four, Calm, Smooth (75) and Five, Dithery, Grizzled (75). Note that both of these are different from the parentals only in the first trait.

The single crossovers between the personality and fur genes are Five, Calm, Grizzled (100) and Four, Dithery, Smooth (100). These differ from the parentals by only the final trait.
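One way to sort all eight classes at once is to ask, gene by gene, whether an offspring matches one chosen parental class, with the genes written in map order (toes, personality, fur). Where neighboring genes disagree about which parental they came from, a crossover happened between them. The sketch below is an illustration of that idea; the names PARENTAL and classify are made up for this example.

```python
# One of the two parental classes, with genes in map order.
PARENTAL = ("Five", "Calm", "Smooth")

counts = {
    ("Five", "Calm", "Smooth"): 620,
    ("Five", "Calm", "Grizzled"): 100,
    ("Four", "Calm", "Smooth"): 75,
    ("Four", "Calm", "Grizzled"): 5,
    ("Five", "Dithery", "Smooth"): 5,
    ("Five", "Dithery", "Grizzled"): 75,
    ("Four", "Dithery", "Smooth"): 100,
    ("Four", "Dithery", "Grizzled"): 620,
}


def classify(offspring, parental=PARENTAL):
    """Label an offspring class by where crossing over occurred."""
    # True where the offspring carries the same allele as the chosen parental.
    toes, personality, fur = (o == p for o, p in zip(offspring, parental))
    crossed_toes_personality = toes != personality
    crossed_personality_fur = personality != fur

    if crossed_toes_personality and crossed_personality_fur:
        return "double crossover"
    if crossed_toes_personality:
        return "single crossover, toes-personality"
    if crossed_personality_fur:
        return "single crossover, personality-fur"
    return "parental"


for offspring, n in counts.items():
    print(", ".join(offspring), "-", n, "-", classify(offspring))
```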

Linkage map distance is calculated just as for the two point cross, by calculating the percentage of the offspring showing recombination between the two genes in question.

Between the toes and personality genes, our two single crossover classes total 150 offspring (75 + 75). But recall that the double crossovers also recombined in this region, and thus need to be added to the single crossovers. There are 10 of them (5 + 5). So the total number of offspring which have recombined in this region is 160.

So the percent recombination is (160/1600)x100, or 10%; the linkage map distance between these two genes is 10 LMU.

Repeat the process with the second and third genes, again remembering that the double crossovers also recombined in this region, so the total number of recombinants is 210 (100 + 100 + 5 + 5).

Percent recombination is (210/1600)x100 = 13.125%, so the distance is 13.125 LMU.

Finally, the distance between the end genes is 10 + 13.125 or 23.125 LMU.
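The whole three point calculation fits in a few lines. This is just the arithmetic from the text restated as a sketch; the variable names are made up for illustration.

```python
total = 1600

# Class totals from the three point cross (each is a reciprocal pair).
single_toes_personality = 75 + 75
single_personality_fur = 100 + 100
double_crossovers = 5 + 5

# Double crossovers recombined in BOTH regions, so they count in each distance.
toes_personality = 100 * (single_toes_personality + double_crossovers) / total
personality_fur = 100 * (single_personality_fur + double_crossovers) / total
toes_fur = toes_personality + personality_fur

print(toes_personality)  # 10.0
print(personality_fur)   # 13.125
print(toes_fur)          # 23.125
```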

Linkage Map Distances are Not Truly Additive

A linkage map unit doesn't actually correspond to a fixed length of chromosome. Non-randomness of crossover location (e.g., crossover repression near the centromere) affects the relationship between the size of an LMU and actual chromosome size. The distance calculated between two genes is also situational. It depends upon just how carefully you are monitoring what is happening in the region between the two genes--specifically upon how many intervening genes you are paying attention to.

To demonstrate this, consider the three point cross above. Our final calculated distance between the end genes (toes and fur) was 23.125 LMU.

What happens if we just ignore the personality gene in the middle? In other words, what if this were a two point cross between toes and fur?

This reduces our data to the following:

Five, Smooth - 625
Five, Grizzled - 175
Four, Smooth - 175
Four, Grizzled - 625

The double crossovers are swallowed by the parentals, because without noting the behavior of the middle gene, they look just like parentals. Each of these represents two crossovers between these two genes, and we've just lost their input. Not only that, but there are almost certainly many other intervening genes, and we're getting no input from those either.

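Calculating the map distance, we have a total of 350 recombinants out of 1600 offspring. So the percentage recombination is (350/1600)x100 = 21.875%, and the distance between these two genes is calculated as 21.875 LMU--noticeably less than the 23.125 LMU calculated from our three point cross. The 1.25 LMU difference is exactly twice the double crossover frequency (2 x (10/1600) x 100), since each double crossover offspring carries two crossovers that now go uncounted.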
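To see where the missing distance goes, the eight three point classes can be collapsed by simply dropping the middle gene. The sketch below is an illustration only; the variable names are made up, and the 23.125 LMU figure is the sum of the two intervals from the three point cross above.

```python
from collections import Counter

# Three point data, genes in map order: toes, personality, fur.
three_point = {
    ("Five", "Calm", "Smooth"): 620,
    ("Five", "Calm", "Grizzled"): 100,
    ("Four", "Calm", "Smooth"): 75,
    ("Four", "Calm", "Grizzled"): 5,
    ("Five", "Dithery", "Smooth"): 5,
    ("Five", "Dithery", "Grizzled"): 75,
    ("Four", "Dithery", "Smooth"): 100,
    ("Four", "Dithery", "Grizzled"): 620,
}

# Ignore the personality gene: classes that differ only there merge together.
two_point = Counter()
for (toes, _personality, fur), n in three_point.items():
    two_point[(toes, fur)] += n

total = sum(two_point.values())
# Parental combinations are Five+Smooth and Four+Grizzled; the rest are recombinant.
recombinants = two_point[("Five", "Grizzled")] + two_point[("Four", "Smooth")]

two_point_distance = 100 * recombinants / total   # 21.875
three_point_distance = 23.125                     # 10 + 13.125 from the text
double_crossover_percent = 100 * (5 + 5) / total  # 0.625

print(two_point_distance)
# The shortfall equals twice the double crossover frequency.
print(three_point_distance - two_point_distance, 2 * double_crossover_percent)
```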

The message here is broader than just the implications for these two examples. There are many genes in a 10 LMU stretch of chromosome, however loosely an LMU corresponds to physical length. This means that, in our three point cross, we are ignoring a lot of potential information that could come from monitoring those intervening genes. If we were to do two more three point crosses, one using our toes and personality genes plus a third gene from the region between them, the other using our personality and fur genes plus an additional gene from the region between them, we would end up with larger calculated distances between toes and personality and between personality and fur, and ultimately an even larger additive distance between toes and fur.

So LMU are only roughly additive, and an LMU does not correspond to a specific length of chromosome. The linkage map distance between two genes is a function of focus--the sharper our focus on the events between the two genes, the larger the calculated distance between them will be.