Wednesday, 26 July 2017

Parent and child comparisons at MyHeritage DNA

I recently transferred my parents' data to the MyHeritage DNA database and I thought it would be an interesting exercise to compare their matches and admixture reports with my own. I transferred my AncestryDNA v1 raw data to MyHeritage and my parents' Family Finder raw data from Family Tree DNA. All three tests were done on the same Illumina OmniExpress chip so there should be an almost complete overlap of SNPs.

MyHeritage are the newest entrant into the genetic genealogy market. They launched their autosomal test in November 2016. If you've tested with AncestryDNA, Family Tree DNA or 23andMe it is currently possible to do a free transfer to MyHeritage. It is not clear if the transfer will be free in the long term so do take advantage while you have the chance.

While the MyHeritage database still has a long way to go to catch up with the other companies there are already early reports of DNA success stories. MyHeritage benefits from a website which is available in many different languages, and they are therefore likely to attract customers who will not be found in any of the other databases.

DNA matches
MyHeritage currently provide information about the amount of DNA shared (measured in centiMorgans), the number of shared segments, and the size of the largest segment. A chromosome browser is not provided though this feature is reportedly in development. It is also not yet possible to download a list of your matches, but hopefully this will be possible in future.

One of the most useful aspects of the MyHeritage match list is the country flag shown against the name of each match. This allows you to focus on the matches who live in the countries where you are most likely to share recent genealogical ancestors.

My dad currently has 59 matches at MyHeritage (excluding me as his daughter). Most of his matches are in America but he has four matches with people from Great Britain, three from Sweden, and one each from the Czech Republic, Canada and Norway.

My mum has 20 matches at MyHeritage (excluding me as her daughter). Again the matches are predominantly with Americans but she has two matches with people from Great Britain and one with an Australian.

I have 24 matches at MyHeritage (excluding my parents). I have one match each from Luxembourg, Great Britain, Australia and Ireland. The rest of my matches are in America.

MyHeritage have a nice Shared DNA Matches feature which not only allows you to see which matches you have in common but also provides relationship predictions and the amount of shared DNA for both matches side by side. This is what the Shared DNA Matches page looks like for me and my mum.

I share two of my 24 matches with my mum and six matches with my dad. However this means that 16 of my 24 matches (67%) do not match either of my parents. These matches are either false positives in my own match list or false negatives in my parents' lists, but without further investigation it's not possible to tell which.
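The count of unexplained matches is simple set arithmetic, and a short sketch makes it easy to repeat for other family groups. The match IDs below are hypothetical placeholders, not real names, and the sketch assumes that the two matches shared with my mum and the six shared with my dad do not overlap, which is what a total of eight shared matches implies.

```python
def unshared_matches(child, mum, dad):
    """Return the child's matches that appear in neither parent's list."""
    return set(child) - (set(mum) | set(dad))


# Hypothetical placeholder IDs standing in for real match names.
child = {f"match{i:02d}" for i in range(1, 25)}            # 24 matches
shared_with_mum = {"match01", "match02"}                   # 2 matches
shared_with_dad = {f"match{i:02d}" for i in range(3, 9)}   # 6 matches

unshared = unshared_matches(child, shared_with_mum, shared_with_dad)
share = len(unshared) / len(child)
print(f"{len(unshared)} of {len(child)} matches ({share:.0%}) match neither parent")
```

If a match turned out to be shared with both parents, the union would shrink by one and the unexplained count would rise accordingly.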

I don't recognise any of the names in the match lists and it seems to me that, even if the matches are real, the relationship predictions are overly optimistic. Some of the matches are predicted to be second to fourth cousins, and even the most distant matches are predicted to be third to sixth cousins. However, I do not have any ancestors who emigrated to countries like Sweden, Norway and the Czech Republic. I also don't have any ancestors who emigrated to America. I do have a few cousins in America through a collateral line but I know them all by name. The Americans on my match list are likely to be very distant cousins, if they are related to me at all. Of the matches that I share with my parents all eight of them are in the US.

Clearly MyHeritage need to do some work on the matching algorithms, and I'm sure we will see some improvements in future. For the moment it doesn't seem worth investing too much time in researching these matches.

Comparing admixture percentages
In addition to cousin matching, the MyHeritage test also includes a free admixture report which they call an Ethnicity Estimate. Results are compared with 42 reference populations around the world, and there are plans to add further populations in the future. MyHeritage do not state what time depth their test is designed to cover.

Here are the details of my dad's genealogical ancestry:
  • Four grandparents born in England: Bristol, Gloucestershire, London (x2).
  • Eight great-grandparents born in England: Bristol (x2), Devon, Essex, Gloucestershire, Hertfordshire, London (x2).
  • Fifteen great-great-grandparents born in England: Devon (x2), Bristol, Essex, Gloucestershire, Hertfordshire (x2), London. One great-great-grandparent born in Scotland (location not known). The birthplace of seven of his English great-great-grandparents is unknown. Four were probably born in Bristol or in a nearby county. Three were Londoners who could have moved to London from anywhere in England.
Here are my dad's admixture percentages from MyHeritage.

Here are the details of my mum's genealogical ancestry:
  • Four grandparents born in England: London (x2), Hampshire (x2).
  • Eight great-grandparents born in England: Berkshire, Hampshire, London (x3), Somerset, Wiltshire. The birthplace of one great-grandparent is not known but he was probably born in London.
  • One great-great-grandparent born in County Kerry, Ireland. Fifteen great-great-grandparents born in England: Bedfordshire, Berkshire (x2), Gloucestershire, Hampshire (x2), Hertfordshire, London (x2), Somerset (x2), Wiltshire. The birthplace of three of her English great-great-grandparents is unknown. One was probably born in Hampshire. The other two were probably Londoners who could have come from anywhere in the country.
Here are my mum's admixture percentages from MyHeritage:
Here are my admixture percentages from MyHeritage.
It's good to see that MyHeritage are at least trying to produce regional distributions within the British Isles, even though the results are somewhat off the mark. It's interesting to see that my parents come out with such very different results, despite the fact that they both have predominantly English ancestry. We have no Italian ancestry and the Italian component in the MyHeritage test does not show up in our results in tests with any other company. The admixture reports will no doubt be refined in future as the methodology improves and more reference datasets are added.

Update 12th November 2017
We have been getting reports of close matches which have been incorrectly reported at MyHeritage DNA. Lorna Henderson has reported problems with a second cousin match which was identified at other companies but not at MyHeritage DNA. CeCe Moore has blogged about a number of cases where half-siblings were reported as sharing an unexpectedly low amount of DNA. Yaniv Erlich, MyHeritage's Chief Scientific Officer, has responded on CeCe's blog and says that "We are well aware of these issues that affect a minority of our close matches. My team is actively working on this and we are in the final steps of a major overhaul to our matching system that resolves many of these issues and better tunes our parameters for our fast growing database." Let's hope that these issues are resolved sooner rather than later.

At present I do not advise trusting the matches reported by MyHeritage. If you've tested at MyHeritage I would recommend in the first instance that you download your raw data and take advantage of the free transfer to Family Tree DNA. Note that MyHeritage DNA transfers are exempt from the $19 fee to unlock the chromosome browser and MyOrigins reports. This will allow you to do a double check on the amount of shared DNA and access a different database of matches. When calculating the cM totals at FTDNA be sure to exclude all the small segments under 5 cMs or 7 cMs to get a more accurate reflection of the relationship.
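The recalculation is easy to do by hand, but here is a minimal sketch of the idea. The segment lengths are made-up figures for illustration only, and the function name and the 5 cM and 7 cM cut-offs passed to it are simply the thresholds mentioned above rather than anything taken from FTDNA's own tools.

```python
def total_shared_cm(segments_cm, min_cm=7.0):
    """Sum segment lengths (in cM), ignoring segments shorter than min_cm."""
    return sum(cm for cm in segments_cm if cm >= min_cm)


# Made-up segment lengths for one match, not real data.
segments = [34.2, 12.8, 6.1, 3.4, 2.9, 1.7]

print("All segments:   ", round(total_shared_cm(segments, min_cm=0), 1))
print("5 cM and above: ", round(total_shared_cm(segments, min_cm=5), 1))
print("7 cM and above: ", round(total_shared_cm(segments, min_cm=7), 1))
```

The gap between the first figure and the other two shows how much of a reported total can come from small segments alone.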

8 comments:

Jules van Laar said...

Interesting.

On MyHeritage:

I have 8 matches. Four of them I do not share with either parent. Of those four I do share two that are my parents (my parents are distant relatives on more than 8 different lines) so I share my dad as a match with my mom and my mom as a match with my dad. So I'm at 50%.

My mom has only 3 matches (excluding myself). I share my dad with her and I share one other match. This shared other match seems to be genuine. Although from Austria, her surname is of my ethnicity and a preliminary search in newspaper archives hints that she is from my ancestral area.

My dad has 57 matches (excluding myself). I share my mom and one other match with my dad. This other match I recognize as also being a match on 23andMe. The surname is English with no hints as to any relation to my ancestral area.

What startles me is how many matches my dad has and how few me and my mom have. On 23andMe my parents are pretty balanced in terms of matches with both having 1200+. On FamilyTreeDNA my dad has about 300+ matches and my mother 200+ matches. Maybe in the grand scheme of things 5 or 60 matches don't seem to matter compared to a database that can give 1200 matches. My mom might be lagging behind now but it might even out, still seems a bit odd.

I think it might be time to add my sisters to the MyHeritage database. They might match some of these matches of my parents that I do not match.

Debbie Kennett said...

Jules, It sounds like you have your work cut out interpreting your matches when your parents are related to each other on so many distant lines. The more close relatives you test the better in your situation.

I was also surprised at the gender imbalance. I can't think of an explanation.

Anna Brueton said...

This post is about comparisons between My Heritage and FTDNA in general, rather than parent-child comparisons. My mother's ancestry was almost wholly from South Wales (just 1 English g-g-grandfather) and my father was Estonian. He did not know his father's identity, though he was definitely conceived in Estonia.

The results are surprisingly similar:

FTDNA                  My Heritage
British 62%            Irish/Scottish/Welsh 59.2%
East Europe 17%        Baltic 25.5%
Finnish 20%            Finnish 13.3%
Non-European <1%       Native American 2.0%

I can believe that it is quite difficult to distinguish Baltic/E. Europe/Finnish, and I expect that the non-European is Siberian or Sami rather than Native American! What surprised me was that my British component was substantially more than 50%. Is this just a lack of precision in the methodology, or is it evidence that my unknown grandfather was partly British?

I would agree that My Heritage predicted relationships are optimistic, but no more so than those of FTDNA. Although I have 10 predicted 2nd-4th cousins and 119 3rd-5th cousins on FTDNA, none match with my paper trail and most seem very unlikely to be related within the last 250 years.

Jules van Laar said...

I once started researching an American match with surnames that are very specific to my ancestral area, she was a match to my mother. As I suspected her family emigrated. I started making a tree for the match, focusing on surnames and villages that are known in my ancestry, I ended up linking her to my tree, but I did it through my father's ancestry; but she wasn't even a match for him, so I went on working through other lines.

I did eventually find a link to my mother's side of the family, but because I think any family from this area that has been there for the last 400+ years is probably endogamous, she's probably related to my parents on multiple lines too, so I don't know if the particular shared ancestor is the ancestor related to the shared segment.

Debbie Kennett said...

Anna

I don't think you can read too much into these admixture percentages. I don't believe there are any reference samples from Estonia so you're being matched against other European samples instead. The Finns were historically a genetic isolate so Finnish DNA does usually show up but perhaps Finnish in your case is serving as a proxy for Estonian. The Native American is probably just noise. It's best to focus on your matches and look at their genealogies rather than making inferences from your MyOrigins results.

The FTDNA relationship predictions are just as much of a problem. FTDNA include small segments under 5 cMs to make the relationship predictions. Most of these small segments are false matches but they have the effect of falsely inflating the cM count. You'll get more realistic estimates if you recalculate the numbers and exclude the small segments.

Debbie Kennett said...

Jules

Once we get back beyond about five or six generations there's a much greater possibility that we will find ancestors to whom we're related in multiple ways as a result of cousin marriages. There was a very small founding population in Colonial America and that seems to have created a bottleneck in the first 200 years of settlement. It does seem to be the case that most Americans with Colonial ancestry are related to each other in multiple ways in a fairly recent timeframe. You can see this with their shaky leaf hints at AncestryDNA. Sometimes they can have five or more possible ancestral couples highlighted for a single DNA match!

Anna Brueton said...

Debbie

Thank you for the tip about excluding small segments - I'll try it out.

I had expected that Finnish would be a proxy for Estonian - they are linked historically and have very similar non Indo-European languages, though no doubt Estonians are more mixed genetically, as they were less isolated geographically.

I have several matches with Finns, but as they are scattered over quite a wide area, I'm inclined to think they are false positives.

Anna

Debbie Kennett said...

Anna

It is still very difficult with the current matching algorithms to draw conclusions about matches much beyond the fourth or fifth cousin level. All the companies are delivering quite a lot of false positive matches though MyHeritage have by far and away the worst rate at present. However, I have faith that they will make improvements in the near future.