Friday 15 April 2022

Comparing ethnicity estimates and ethnicity inheritance results from AncestryDNA for a child and her parents

I wrote about AncestryDNA's new SideView technology and the new ethnicity inheritance tool earlier this week. My results for my parents weren't yet available when I wrote the post and I thought it would be interesting to do a three-way comparison.

Ethnicity estimates


Debbie's dad
My dad's ancestry within the last few hundred years is all English apart from one maternal great-great-grandfather who is from Scotland. His paternal side is from Devon, Bristol and Gloucestershire. His maternal side is from Essex, Hertfordshire, Oxfordshire and London. Here is his updated ethnicity estimate.

The ranges are:
  • England and Northwestern Europe 49% to 69%
  • Wales 2% to 28%
  • Scotland 0% to 30%
  • Sweden and Denmark 0% to 16%
  • Norway 0% to 15%.
The ranges can be found by clicking on the country names.

Debbie's mum
My mum's ancestry in the last few hundred years is all from England apart from one paternal great-great-grandmother from Ireland. Her paternal ancestry is from Hampshire (via London), Somerset and Ireland. She has an unknown paternal great-grandfather who is probably from Oxfordshire. Her maternal ancestry is from Hertfordshire, London, Hampshire, Berkshire, Bedfordshire, Buckinghamshire, Gloucestershire and Wiltshire. Here is her updated ethnicity estimate.

The ranges are:
  • England and Northwestern Europe 70% to 100%
  • Wales 0% to 17%
  • Ireland 0% to 17%
  • Norway 0% to 5%.

Debbie
Here is my updated ethnicity estimate.

The ranges are:
  • England and Northwestern Europe 65% to 99%
  • Scotland 0% to 29%
  • Wales 0% to 18%
  • Ireland 0% to 9%.

I am disregarding Norway, Sweden and Denmark in my parents' results as noise. The reference population labelled as Wales actually stretches into North Devon, North Somerset, Bristol, Gloucestershire and Herefordshire, so this may be a real representation of my dad's paternal ancestry from Devon, Bristol and Gloucestershire and my mum's maternal ancestry from Gloucestershire. There was a big migration from North Devon to South Wales in the 1800s, with people moving to Wales to work in the copper and coal mines, so many people from South Wales have Devon ancestry, which may account for the overlap. The Scotland component is over-represented in many people at AncestryDNA, and this reference population should probably have been labelled as Scotland, Ireland and England.

The genetic communities (also known as regions) are uncannily accurate. It's interesting to note that I get Gloucestershire, Wiltshire and Oxfordshire as a region despite the fact that neither of my parents has this region. This can easily be explained by the fact that both of my parents have ancestry from both Gloucestershire and Oxfordshire, so I actually get a double dose of DNA from these counties.

Ethnicity inheritance overview


Debbie's dad
This is the ethnicity inheritance overview and detailed comparison for my dad. If the Welsh component represents my dad's paternal ancestry from Devon, Bristol and Gloucestershire then Parent 1 is his dad. However, my dad's Scottish ancestry is on his mother's side, yet Scotland appears on both parental sides and the percentage is much higher for Parent 1 than for Parent 2. I therefore do not feel confident in assigning parental sides to these results.

Debbie's mum
This is the ethnicity inheritance overview and detailed comparison for my mum. Ireland only appears in Parent 1 and Wales only appears in Parent 2 so I am assuming that Parent 1 is her father and Parent 2 is her mother.

Debbie
Here is my ethnicity inheritance overview and detailed comparison. The Scotland component is the odd one out here as it appears in both parents whereas it is only reported in my dad's results. Ireland only appears in Parent 1. I had originally assumed therefore that Parent 1 is my mum. However, the absence of Wales on my dad's side is surprising given that he had a much higher percentage of the Welsh component than my mum so it's quite possible that Parent 1 and Parent 2 are the other way round instead.

Conclusion

SideView is an innovative new technology from AncestryDNA and I remain excited by the possibilities it has opened up. While the "ethnicity" estimates are still a work in progress they are a huge improvement compared to the early days of autosomal DNA testing. When I received my first biogeographical ancestry report from 23andMe back in 2010 they were only able to tell me that I was 100% European. We are now getting much more granularity within Europe, even if the country-level assignments, especially with the low percentages, are not very accurate. We can expect the results to improve over time. AncestryDNA regularly add new regions and provide updated ethnicity estimates every year. We can probably expect a further update this summer or in the autumn. 

The genetic communities are worked out in a different way and are based not on reference populations but on large segments of DNA shared in genetic networks. They are reflective of our recent ancestry within the last 200-300 years. They are remarkably accurate and for most people generally correspond with their known ancestry. Another possible application for the SideView technology would be to assign genetic communities to parental sides though, as in my case, it may be that some communities would need to be assigned to both sides. We have much to look forward to this year!

Wednesday 13 April 2022

SideView: a new method for assigning matches and biogeographical ancestry to paternal and maternal sides at AncestryDNA

I am fortunate that I have been able to test both of my parents at AncestryDNA which means that I am able to determine whether my matches are on the paternal or maternal sides. For matches sharing over 20 centiMorgans (cM) AncestryDNA automatically label matches as belonging to the father's side or the mother's side if you have tested your parents. This does of course mean that matches sharing lower amounts of DNA are not labelled, though I can still check to see which parent matches my cousins and I can assign the match manually using the relationship assignment tool. Some matches cannot be assigned to a side as they do not match either of my parents and are therefore probably false matches, though this is less of a problem now that AncestryDNA have removed all the 6 and 7 cM matches which accounted for the bulk of the false matches.
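The labelling logic described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not AncestryDNA's code: the function name assign_side, its parameters and the "both" outcome are assumptions made for the example. Only the 20 cM threshold comes from the text.

```python
# Hypothetical sketch of the side-labelling reasoning; not AncestryDNA's implementation.
AUTO_LABEL_THRESHOLD_CM = 20  # from the text: matches over 20 cM are labelled automatically


def assign_side(shared_cm, matches_father, matches_mother):
    """Return (side, how) for one match, given whether it also matches each tested parent."""
    if matches_father and matches_mother:
        side = "both"  # assumption: a match shared with both parents is flagged as both sides
    elif matches_father:
        side = "paternal"
    elif matches_mother:
        side = "maternal"
    else:
        # The match shares no DNA with either parent, so it is probably a false match.
        return ("unassigned", "probably false match")
    # Above the threshold the label is automatic; below it the side has to be set by hand.
    how = "automatic" if shared_cm > AUTO_LABEL_THRESHOLD_CM else "manual"
    return (side, how)
```

For example, a 35 cM cousin who also matches my dad would come out as paternal and automatic, while a 12 cM cousin who only matches my mum would come out as maternal and manual.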

Sorting matches into paternal and maternal sides is a much more difficult and time-consuming process when DNA data from the parents is not available. When working with my parents' matches I use the Shared Matching Tool and the coloured dots to group the matches into clusters. If I can work out the relationship with the match or assign a common ancestral couple to the cluster I can then manually assign the matches in the cluster to the paternal or maternal side, but this is a slow and laborious process.

Wouldn't it be wonderful if the parental sides could be determined automatically? Fortunately that is now likely to be a reality in the very near future thanks to some ground-breaking new research from the scientists at AncestryDNA. 

AncestryDNA have developed a new methodology known as SideView which allows them to separate out the DNA inherited from each parent throughout the genome without the parents taking a DNA test. SideView will be used to power a number of new DNA features at AncestryDNA in the coming months. It will eventually allow us to see our matches separated by parental side and there will be genetic community and journey patterns for each parental side. The sides will be labelled for all of our matches down to 8 cM but the methodology is not perfect and there will be around 15% or 20% of our matches that aren't labelled. There will also be some people with both parents falling in the same group and their matches will be labelled as both sides. This applies to about 3% of the AncestryDNA database.

The technology also opens up many exciting possibilities for the future. AncestryDNA now provide traits reports so I wonder if it might one day be possible for them to identify which traits have been inherited from each parent.

"Ethnicity" inheritance
The first feature enabled by this new SideView technology is known as ethnicity inheritance. If you log into your Ancestry account you should now see this new feature which allows us to see which biogeographical ancestries we have inherited from each of our parents.  It may take a while for the feature to roll out to the entire database. (As a side note, the term ethnicity in this context is a misnomer because ethnicity refers to our social, cultural, religious and linguistic heritage and is not necessarily a reflection of our genetic ancestry inheritance though there is often some overlap.) Despite the quibble about the name, this is potentially a very useful tool, particularly for those people who know nothing about their ancestral origins.

In addition to the rollout of the ethnicity inheritance feature, our "ethnicity" estimates have also been updated based on the new technology, though no new regions have been added in this latest update. Here is my updated "ethnicity" estimate: 

England has gone up from 71% to 77% since the last update in July 2021. Scotland has dropped from 20% to 14%, Wales has decreased from 8% to 6% and Ireland has gone up from 1% to 3%. The results are not too far off my documented ancestry though the Scottish percentages are still too high. I have one maternal great-great-great grandparent from Ireland and one paternal great-great-great grandparent from Scotland. All my other documented ancestry is from England. Ancestry's Wales region extends across the English border into Gloucestershire which probably explains my Welsh assignment.

As a reminder, it's always worth clicking on the country name to see the ranges for each different ancestry. As you can see, the range for my Scottish component is anywhere between 0% and 29%.

When you log into your account you will now be invited to view the new "ethnicity" inheritance feature.
Ancestry explain that each of your parents contributed half of your DNA. You can see which ancestries you have inherited from each parent even if they haven't taken a DNA test. It is important to remember that this is not providing an estimate for our parents as we only have 50% of our DNA from each of our parents so they will have DNA that we don’t. If you are able to test your parents then you will receive insights into the DNA that they have inherited from your grandparents. For those of us who have tested our parents the ethnicity inheritance feature is currently based on the SideView technology rather than using the data from our parents, though of course our parents are among our matches so they are considered part of the process.
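The arithmetic behind the two halves can be shown with a small Python sketch. It assumes that each parental half is reported so that it totals 50% and that the two halves add up to the overall estimate. The overall figures are my own updated percentages, but the split between Parent 1 and Parent 2 is invented purely for illustration, and the function name combine_halves is hypothetical.

```python
from collections import Counter


def combine_halves(parent1, parent2):
    """Add the two parental halves together to recover the overall estimate."""
    for half in (parent1, parent2):
        if sum(half.values()) != 50:
            raise ValueError("each parental half should total 50%")
    return dict(Counter(parent1) + Counter(parent2))


# Invented split for illustration only; these are not the figures from my report.
parent1 = {"England": 36, "Scotland": 5, "Wales": 6, "Ireland": 3}
parent2 = {"England": 41, "Scotland": 9}

overall = combine_halves(parent1, parent2)
```

With this made-up split the combined result is England 77%, Scotland 14%, Wales 6% and Ireland 3%. Note that the halves only describe the DNA that was passed on, so doubling a half does not give the parent's own estimate.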

If you click on "View breakdown" you will be able to see your overview report comparing your ancestry breakdown with the two halves inherited from your parents.

There is also a detailed comparison showing the information in a tabular format.

The algorithm is not able to identify which parent has contributed the ancestries to the different parental sides but from the evidence of my family tree I can infer that Parent 1 with England, Scotland, Wales and Ireland is my mum (though she has no Scottish ancestry) and Parent 2 is my dad. Ancestry intend to provide us with an Edit Parents button which will allow us to label the parental sides if we are able to determine which parent has contributed the different ancestries to our DNA. The "ethnicity" features will be integrated with the match lists and vice versa so any changes we make will be reflected throughout the entire website.

Technical details
The new SideView system uses the power of AncestryDNA's massive database of over 21 million people and the vast networks of shared matches. It works on the premise that the DNA we share with our matches is only shared on one parental side. This means that if the matches can be sorted into two separate groups it will be possible to determine which side of your DNA is associated with each parent. Ancestry have found a way to assign matches into parental groups by looking at the segments of DNA shared in common with our matches. 
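The idea of sorting matches into groups can be illustrated with a toy example in Python. This is a deliberately simplified sketch and not the SideView method itself, which works on the shared DNA segments and is described in the preprint. Here I simply assume that two matches who also match each other belong to the same group, and the function name group_matches and the sample names are hypothetical.

```python
from collections import defaultdict


def group_matches(matches, shared_pairs):
    """Group matches into clusters of people who are linked by shared matches."""
    # Build a lookup of who shares a match with whom.
    neighbours = defaultdict(set)
    for a, b in shared_pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    groups, seen = [], set()
    for start in matches:
        if start in seen:
            continue
        # Walk outwards from this match, collecting everyone reachable via shared matches.
        group, stack = set(), [start]
        while stack:
            person = stack.pop()
            if person in seen:
                continue
            seen.add(person)
            group.add(person)
            stack.extend(neighbours[person] - seen)
        groups.append(group)
    return groups


# Toy data: A, B and C all link together, as do D and E.
matches = ["A", "B", "C", "D", "E"]
shared_pairs = [("A", "B"), ("B", "C"), ("D", "E")]
groups = group_matches(matches, shared_pairs)
```

In this toy data the matches fall neatly into two groups, which would correspond to the two parental sides. Real match lists are messier: there will usually be many more clusters, some matches will be linked to both sides, and some will not be linked to anyone.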

AncestryDNA claim that because of the size of their DNA database, SideView groups matches with a precision rate of 95% for 90% of Ancestry customers. With 11.5 million people in the database the accuracy drops to 85% for the majority of customers. With a database of five million the accuracy is around 65% and with a million people it is 50%.

Ancestry have published an excellent article in their Support Centre explaining how the SideView technology works.

There is also a new article in the Support Centre explaining how "ethnicity" inheritance works.

They are also planning to have some interactive educational material available soon using animated GIFs to explain how DNA inheritance works and to give people a better understanding of the new SideView technology. 

The full technical details of the methodology behind SideView have been published in a new preprint by the AncestryDNA scientists Keith Noto and Luong Ruiz. The paper is entitled "Accurate genome-wide phasing from IBD data" and is available on the bioRxiv preprint server: https://doi.org/10.1101/2022.04.11.487932

AncestryDNA have also lodged a patent application with the United States Patent and Trademark Office which provides further highly technical information: https://uspto.report/patent/app/20210034647

There will also be an updated "ethnicity" white paper which will be available in a couple of months.

Further reading 

A change for e-mail subscribers to this blog

Time has flown by and for one reason or another I've not had much time for blogging in the last year. I note that my last blog post was published on 19th March 2021. I hope to rectify that in the coming weeks and months. A lot has been happening in the genealogy and genetic genealogy worlds and we can look forward to many exciting developments.

I previously used the Feedburner service to send out e-mails to my blog subscribers. Feedburner was acquired by Google in 2007. In 2021 Google announced that they were going to focus on the core Feedburner features and would no longer be supporting e-mail subscriptions. I therefore had to look for a new solution. On the recommendation of my friend David Pike, I settled on a service called MailerLite. There was quite a steep learning curve but I hope I've now got everything sorted. If you were previously subscribed to my blog you've now been transferred to the new list. If you're reading this in your e-mail client then the transition has worked successfully. You may wish to add my e-mail address to your address book to ensure that the e-mails don't go into spam. I will monitor how the transition goes. If anyone has any problems do get in touch.