Sunday, 26 April 2015

Tracking DNA segments through time and space

One of the exciting aspects of autosomal DNA testing is that it gives us the opportunity of assigning segments of DNA to specific ancestors and then tracking the inheritance of those segments over time. To date the only match where I've been able to make the genealogical connection, other than with immediate members of my family, is with a fourth cousin, Mr K, who lives in Canada. I wrote previously about this match in my article "My first autosomal DNA success story". That match was very easy to identify because all my ancestors are in the UK, all Mr K's ancestors are in Canada, and there was only one possible line where we could connect. It also helped that one of the shared surnames - Cruwys - is very rare. We can therefore state with confidence that the segment we share has been inherited from our ancestors William Cruwys (1793-1846) and Margaret Eastmond (1792-1874) who married in Rose Ash, Devon, on 18th July 1814.

I'd already tested my parents, but one of my sons has now also taken the Family Finder test, which gives me the opportunity to explore the inheritance patterns of these shared DNA segments in more detail.

In the screenshot below I've compared my dad with Mr K. They are third cousins once removed. They share three large segments in common: 20.12 centiMorgans on chromosome 1; 23.33 centiMorgans on chromosome 3 and 17.12 centiMorgans on chromosome 11.

Next I used the chromosome browser to compare myself with Mr K. Mr K and I are fourth cousins. You can see that two of the segments that my dad inherited have not been passed on to me and I only share a single segment on chromosome 11 with Mr K. This segment is 16.62 cMs and has been passed on virtually intact from my father to me.

The next screenshot shows a comparison with my son and Mr K. They are fourth cousins once removed. You can see that my son has inherited exactly the same segment as me. This segment measures 16.85 cMs and appears to be slightly larger in my son than it is in me, which is perhaps something to do with the rounding that FTDNA uses.

For all the above screenshots I've set the threshold at 5 cMs. Family Tree DNA are the only company who provide segment data under the 5 cM threshold. There has been much debate in the genetic genealogy community on the subject of small segments under 5 cMs, but there is a consensus that the vast majority of the small segments generated by the current chip tests are false positives and are nothing more than random noise. However, now that I have tested three generations of my family and we also have a match with a known cousin, I thought I would take the opportunity to do a comparison of the small segments to satisfy my own curiosity.

The screenshot below is taken from the perspective of my son, and I've set the threshold to 1 cM. The chromosome browser shows the segments my son shares in common with me, his maternal grandfather and his cousin Mr K. The segments shared with Mr K are shown in orange. The segments he shares with his grandfather are shown in green. The blue segments are shared in common with me. A child receives 50% of his DNA from his mother so my son matches me across the entire length of each chromosome. (Note that chromosomes come in pairs - we receive one set of chromosomes from our mother and one set of chromosomes from our father. However, the chromosome browser shows matches on a single chromosome and is unable to identify whether the match is on a maternal or a paternal chromosome.) There are 13 small segments that my son appears to share with Mr K. However, ten of these segments are seemingly shared by me, my son and Mr K but are not shared with my father. Clearly this is a biological impossibility because if a segment is identical by descent (IBD) then by definition it must have been passed on from a parent to a child and it couldn't possibly skip a generation. There are tiny segments on chromosome 6, chromosome 10 and chromosome 16 that are shared by all of us and these segments are therefore more likely to be IBD.
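The generation check described above can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions, not anything that FTDNA provides: the Segment type and the find_skipped_segments function are hypothetical names, all the positions and the small-segment sizes are invented for the example, and a simple overlap of positions stands in for a real match.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """A segment one person shares with the cousin (illustrative type)."""
    chromosome: int
    start: int   # start position in base pairs (invented values below)
    end: int     # end position in base pairs (invented values below)
    cm: float    # segment size in centiMorgans


def overlaps(a: Segment, b: Segment) -> bool:
    """True if the two segments are on the same chromosome and overlap."""
    return a.chromosome == b.chromosome and a.start <= b.end and b.start <= a.end


def find_skipped_segments(child, parent, grandparent):
    """Split the child's segments into those seen in every generation
    and those missing from the parent or the grandparent."""
    consistent, skipped = [], []
    for seg in child:
        in_parent = any(overlaps(seg, p) for p in parent)
        in_grandparent = any(overlaps(seg, g) for g in grandparent)
        if in_parent and in_grandparent:
            consistent.append(seg)
        else:
            skipped.append(seg)
    return consistent, skipped


# Hypothetical data: segments each person shares with the cousin.
# Positions and small-segment sizes are made up for illustration.
son = [
    Segment(11, 30_000_000, 45_000_000, 16.85),
    Segment(6, 1_000_000, 1_900_000, 1.2),
    Segment(2, 5_000_000, 5_800_000, 1.1),
]
mother = [
    Segment(11, 30_000_000, 45_000_000, 16.62),
    Segment(6, 1_000_000, 1_900_000, 1.2),
    Segment(2, 5_000_000, 5_800_000, 1.1),
]
grandfather = [
    Segment(1, 10_000_000, 30_000_000, 20.12),
    Segment(3, 50_000_000, 75_000_000, 23.33),
    Segment(11, 29_500_000, 45_200_000, 17.12),
    Segment(6, 900_000, 2_000_000, 1.3),
]

consistent, skipped = find_skipped_segments(son, mother, grandfather)
for seg in skipped:
    # The invented chromosome 2 segment is flagged: it is absent in the grandfather.
    print(f"Chromosome {seg.chromosome}: {seg.cm} cM skips a generation")
```

A segment that lands in the skipped list cannot be IBD through that line. A segment in the consistent list is only a candidate, because an overlap in unphased data does not prove that the match is real.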

There have been suggestions that the process of triangulation (identifying three or more people who all match one another on the same segment of a chromosome) confirms that the segments are "real" or in other words that they are identical by descent (IBD). In this case the 13 small segments all triangulate with three people - me, my son and Mr K. However, when my dad is added to the mix we can see that the triangulation process breaks down. If the small segments were IBD then my dad should match on all of these small segments.

In the future, when whole genome sequencing becomes the norm, it should be possible to use small segments for genealogical matching purposes, but given the limitations of the current technology, extreme caution should be used when drawing conclusions about matches on small segments.

Further reading
The ISOGG Wiki article on identical by descent has further information on this subject:


There have been a number of blog posts that have dealt with the subject of small segments and they are all linked on the ISOGG Wiki page. I particularly recommend reading the following:


- Genealogy and autosomal DNA matches: common errors in “proving” an ancestor, and the allure of easy gateway ancestors by "Our Puzzling Past"

- Chromosome Pile-Ups in Genetic Genealogy: Examples from 23andMe and FTDNA by "Our Puzzling Past"



- What a difference a phase makes by Ann Turner (a guest post on Blaine Bettinger's blog)

Disclosure
I received a free DNA test from Family Tree DNA in compensation for speaking at Who Do You Think You Are? Live in 2014. I chose to have a Family Finder test, which I used to test my son.

© 2015 Debbie Kennett

9 comments:

Louis Kessler said...

Debbie,

You say: "There have been suggestions that the process of triangulation ... confirms that the segments are ... identical by descent (IBD). ... However, when my dad is added to the mix we can see that the triangulation process breaks down. If the small segments were IBD then my dad should match on all of these small segments."

Roberta Estes said in https://dna-explained.com/2016/07/21/nine-autosomal-tools-at-family-tree-dna/ that "technically, it’s not triangulation in cases where very close relatives are involved. For example, parents, aunts, uncles and siblings are too closely related to be considered the third leg of the triangulation stool."

So could it be that your example above does not disprove that small triangulated segments are IBD, since you are using parent/child relationships as one of the legs?

Debbie Kennett said...

Louis

If a segment is IBD by definition it has to have been passed from a grandparent to a parent and to a child. IBD segments can't skip a generation. I therefore don't understand Roberta's point. If a segment is IBD it must triangulate with a grandparent as well as a grandchild.

We can only confirm that segments are IBD by phasing. Phasing is a complicated technical process. You can find an explanation here:

http://www.isogg.org/wiki/Phasing

So far the hypothesis that triangulated segments are always IBD (or triangulated segments over a certain size are always IBD) has not been tested. We need to see if these triangulated segments withstand phasing.

You have to remember that at FTDNA, 23andMe and GedMatch we are working with unphased data so it's much easier to generate false matches.

Louis Kessler said...

Debbie,

What you haven't shown us is where your father and Mr K match. I'd bet that they match on those three small segments your father has in common with your son and Mr K, and I'd also expect that your father and Mr K don't match on the other 10 small segments. The other 10 then would be Identical By Chance (IBC) with your son and that is okay since there is no triangulation in this case (son matches Mr K, son does not match your father, your father likely does not match Mr K)

Debbie Kennett said...

Louis

If it's of any interest I can e-mail you a screenshot of my dad's matches with Mr K when the threshold is dropped to 1 cM. He has the three big segments all over 10 cMs in size. He then seemingly shares a further eight small segments under 5 cMs in size with Mr K, most of which do not correspond with the segments my dad shares with my son. I don't understand what you mean by IBC with my son. By IBC do you mean false matches? I can't tell whether small segments that my dad shares with Mr K are false matches by comparison with my son because you would expect 50% of small segments to be lost each generation anyway. The only way to determine whether matches are real is by phasing and detailed chromosome mapping.

Louis Kessler said...

Debbie,

Yes. Identical By Chance (IBC), which I believe is now the preferred term over Identical by State (IBS), is the false positive.

You are correct. Because your son gets about one quarter of his segments from your dad, you can't tell whether the segments are false matches or not. That is what Roberta is telling us, that you can't use parent/child or siblings as 2 of the 3 people in triangulation. She says that again in her article today: https://dna-explained.com/2016/08/10/concepts-match-groups-and-triangulation/

But triangulation is another, and very valid, way to determine whether matches are real (Identical by Descent): http://originsdna.blogspot.ca/2015/03/triangulated-small-segments-are.html

Yes, I am interested in seeing your Dad's matches with Mr K. I'll send you an email.

Louis

Debbie Kennett said...

Louis

Identical by descent and identical by state are the two scientific terms. I know some people are using IBC to describe false matches but I don't personally like the term as these matches aren't identical. However, IBS is not the same as a false match. Segments can be IBS for many reasons. IBD is a subset of IBS. I suggest you watch this excellent video from the AncestryDNA scientists which explains some of these concepts:

https://www.ancestry.com/academy/course/dna-genetic-genealogy

The article you cite on the Origins DNA blog makes a lot of false assumptions. You can't use cumulative probabilities based on individual alleles. Alleles are passed on in segments not individually. In any case for research to be valid it needs to be published in a peer-reviewed scientific journal and not on a blog.

We don't know if triangulation is a substitute for phasing to determine whether a segment is real or not because the hypothesis has not yet been tested. Even after phasing you still have to determine whether a segment is recent IBD or ancient IBD. If a lot of people match on the same segment then it's much more likely to be ancient IBD or IBS for some other reason that we don't yet understand.

See my previous two blog posts on triangulation:

http://cruwys.blogspot.co.uk/2016/01/autosomal-dna-triangulation-part-1.html


http://cruwys.blogspot.co.uk/2016/01/autosomal-dna-triangulation-part-2.html


Thanks for the e-mail. I'll reply to that separately.

Louis Kessler said...

Debbie,

The reason why I started using the term IBC instead of IBS is because Roberta Estes says the IBS term is archaic and IBC should be used instead: https://dna-explained.com/2016/03/10/concepts-identical-bydescent-state-population-and-chance/

Whether or not the OriginsDNA article is accurate, the main point about triangulation is that it exponentially reduces the probability of a chance match because each person now must match two others by chance rather than just one by chance.

Believe me, I've already read and studied both of your excellent posts on triangulation a dozen times each.

Louis

Louis Kessler said...

Debbie,

Kitty Cooper explains the adoption of the term IBC here: When is a DNA segment match a real match? IBD or IBS or IBC?

Louis

Debbie Kennett said...

Louis

Roberta is not a scientist. I don't know why she says IBS is archaic. It is a standard term used in population genetics that appears in all the textbooks. See Ann Turner's blog post on the anatomy of an IBS segment:

https://segmentology.org/2015/10/02/anatomy-of-an-ibs-segment/

I've seen Kitty's post. I know that some people are using the term IBC. I prefer to stick to the scientific terminology and to use "false matches" for the pseudosegments.