Friday, 28 July 2017

DNA surprises

In all my genetic genealogy talks I always warn people to be prepared for the unexpected when taking a DNA test. DNA is a very powerful tool for the genealogist but it can also uncover family secrets and reveal close relations that we didn't know existed. Furthermore, we don't always get the answer we expected. As Bennett Greenspan of Family Tree DNA often says in his talks: "If you don't want to know the answer, don't ask the question". For some people DNA can completely overturn their concept of identity and they discover that they are not who they thought they were.

Sometimes DNA can reveal the most incredible stories that are stranger than anything in fiction. One such story was published this week in The Washington Post. The article focuses on a number of surprise findings from DNA testing but tells in detail the story of Alice Collins Plebuch who took a DNA test with AncestryDNA which was to change her life forever. The article is a long read but a very worthwhile investment of your time. The journalist Libby Copeland is to be congratulated for her sensitive coverage of this story and her meticulous attention to detail. You can read the article by clicking on this link.

We have in fact known about this story in the genetic genealogy community for several years now, but this is the first time it has been picked up by the mainstream media. If you want some further background information check out this article on CeCe Moore's blog where the story was first revealed. There is an additional perspective in this blog post. Both of these blog posts also have additional photographs that weren't in The Washington Post article, but don't read the blog posts until you've read the article.

Wednesday, 26 July 2017

Parent and child comparisons at MyHeritage DNA

I recently transferred my parents' data to the MyHeritage DNA database and I thought it would be an interesting exercise to compare their matches and admixture reports with my own. I transferred my AncestryDNA v1 raw data to MyHeritage and my parents' Family Finder raw data from Family Tree DNA. All three tests were done on the same Illumina OmniExpress chip so there should be an almost complete overlap of SNPs.

MyHeritage are the newest entrant into the genetic genealogy market. They launched their autosomal test in November 2016. If you've tested with AncestryDNA, Family Tree DNA or 23andMe it is currently possible to do a free transfer to MyHeritage. It is not clear if the transfer will be free in the long term so do take advantage while you have the chance.

While the MyHeritage database still has a long way to go to catch up with the other companies there are already early reports of DNA success stories. MyHeritage benefits from a website which is available in many different languages, and they are therefore likely to attract customers who will not be found in any of the other databases.

DNA matches
MyHeritage currently provide information about the amount of DNA shared (measured in centiMorgans), the number of shared segments, and the size of the largest segment. A chromosome browser is not provided though this feature is reportedly in development. It is also not yet possible to download a list of your matches, but hopefully this will be possible in future.

One of the most useful aspects of the MyHeritage matches feature is that country flags are shown against the names of your matches. This allows you to focus on the matches who live in the countries where you are most likely to share recent genealogical ancestors.

My dad currently has 59 matches at MyHeritage (excluding me as his daughter). Most of his matches are in America but he has four matches with people from Great Britain, three from Sweden, and one each from the Czech Republic, Canada and Norway.

My mum has 20 matches at MyHeritage (excluding me as her daughter). Again the matches are predominantly with Americans but she has two matches with people from Great Britain and one with an Australian.

I have 24 matches at MyHeritage (excluding my parents). I have one match each from Luxembourg, Great Britain, Australia and Ireland. The rest of my matches are in America.

MyHeritage have a nice Shared DNA Matches feature which not only allows you to see which matches you have in common but also provides relationship predictions and the amount of shared DNA for both matches side by side. This is what the Shared DNA Matches page looks like for me and my mum.

I share two of my 24 matches with my mum and six matches with my dad. However this means that 17 of my 24 matches (71%) do not match either of my parents. These matches are either false positives or false negatives, but without further investigation it's not possible to tell.
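The comparison above is simple set arithmetic, and it can be sketched in a few lines of Python. The match IDs below are made up for illustration (the real match lists are not reproduced here), and the sketch assumes that one match appears on both parents' lists, which is one way that the figures of two, six and 17 can fit together.

```python
# Hypothetical match IDs standing in for the real match lists.
my_matches = {f"m{i:02d}" for i in range(1, 25)}  # 24 matches in total

# Assumption for illustration: "m02" is shared with both parents.
shared_with_mum = {"m01", "m02"}                                # 2 matches
shared_with_dad = {"m02", "m03", "m04", "m05", "m06", "m07"}    # 6 matches

# A match shared with both parents is only counted once in the union.
shared_with_either = shared_with_mum | shared_with_dad

# Matches that appear on neither parent's list.
unmatched = my_matches - shared_with_either
percent_unmatched = round(100 * len(unmatched) / len(my_matches))

print(len(shared_with_either), len(unmatched), percent_unmatched)
```

Using sets rather than adding the two counts together avoids double counting anyone who matches both parents.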

I don't recognise any of the names in the match lists and it seems to me that, even if the matches are real, the relationship predictions are overly optimistic. Some of the matches are predicted to be second to fourth cousins, and even the most distant matches are predicted to be third to sixth cousins. However, I do not have any ancestors who emigrated to countries like Sweden, Norway and the Czech Republic. I also don't have any ancestors who emigrated to America. I do have a few cousins in America through a collateral line but I know them all by name. The Americans on my match list are likely to be very distant cousins, if they are related to me at all. Of the matches that I share with my parents all eight of them are in the US.

Clearly MyHeritage need to do some work on the matching algorithms, and I'm sure we will see some improvements in future. For the moment it doesn't seem worth investing too much time in researching these matches.

Comparing admixture percentages
In addition to cousin matching, the MyHeritage test also includes a free admixture report which they call an Ethnicity Estimate. Results are compared with 42 reference populations around the world, and there are plans to add further populations in the future. MyHeritage do not state what time depth their test is designed to cover.

Here are the details of my dad's genealogical ancestry:
  • Four grandparents born in England: Bristol, Gloucestershire, London (x2).
  • Eight great-grandparents born in England: Bristol (x2), Devon, Essex, Gloucestershire, Hertfordshire, London (x2).
  • Fifteen great-great-grandparents born in England: Devon (x2), Bristol, Essex, Gloucestershire, Hertfordshire (x2), London. One great-great-grandparent born in Scotland (location not known). The birthplace of seven of his English great-great-grandparents is unknown. Four were probably born in Bristol or in a nearby county. Three were Londoners who could have moved to London from anywhere in England.
Here are my dad's admixture percentages from MyHeritage.

Here are the details of my mum's genealogical ancestry:
  • Four grandparents born in England: London (x2), Hampshire (x2).
  • Eight great-grandparents born in England: Berkshire, Hampshire, London (x3), Somerset, Wiltshire. The birthplace of one great-grandparent is not known but he was probably born in London.
  • One great-great-grandparent born in County Kerry, Ireland. Fifteen great-great-grandparents born in England: Bedfordshire, Berkshire (x2), Gloucestershire, Hampshire (x2), Hertfordshire, London (x2), Somerset (x2), Wiltshire. The birthplace of three of her English great-great-grandparents is unknown. One was probably born in Hampshire. The other two were probably Londoners who could have come from anywhere in the country.
Here are my mum's admixture percentages from MyHeritage:
Here are my admixture percentages from MyHeritage.
It's good to see that MyHeritage are at least trying to produce regional distributions within the British Isles, even though the results are somewhat off the mark. It's interesting to see that my parents come out with such very different results, despite the fact that they both have predominantly English ancestry. We have no Italian ancestry and the Italian component in the MyHeritage test does not show up in our results in tests with any other company. The admixture reports will no doubt be refined in future as the methodology improves and more reference datasets are added.


Wednesday, 19 July 2017

The end of the road for BritainsDNA and MyDNAglobal

I wrote back in May last year that the BritainsDNA family of companies, which includes ScotlandsDNA, IrelandsDNA, CymruDNAWales and YorkshiresDNA, had been rebranded under the new name of MyDNAglobal after the company was taken over by Source BioScience.

On checking the MyDNAglobal website today I discovered that the company is no longer taking orders. The following notice now appears on the website:
Dear Customers 
It is with regret that effective from 3rd July 2017 will no longer be accepting new orders. 
Whilst we have enjoyed offering this individual service it is unfortunately not something we are able to provide going forwards. 
All existing orders will be honoured – if you have recently purchased a test and have yet to return your sample please do so by 31 August 2017 so we can process your results.  
Unfortunately we cannot guarantee that samples received after 31 August 2017 will be processed. 
For those customers who have already received their results these will be available to you via our website until 31 August 2018, after which they will no longer be available. 
If you have any queries please email our support team: 
Thank you for your custom.
If you've tested with any of these companies I would suggest that you download all your data while you have the chance.

For further information on the demise of BritainsDNA and background information on Source BioScience see the article by Ewan Lamb, "BritainsDNA - a thing of the past".

Tuesday, 18 July 2017

The GPS Origins test - the DREAM chip compared with AncestryDNA and 23andMe transfers

Last November I wrote a review of the GPS Origins test in which I was able to compare reports for four people with very different ethnicities, all of whom received disappointing results. However, the reports were all based on transfers of data from 23andMe or AncestryDNA. The GPS Origins test was designed for use with a custom microarray chip known as the DREAM (Diversity of REcent and Ancient huMan). This chip has over 800,000 markers compared with 700,000+ markers for the AncestryDNA v1 chip and 500,000+ markers for the 23andMe v4 chip.

The DREAM chip was developed by Dr Eran Elhaik who is currently based at the University of Sheffield. In February this year Dr Elhaik gave a presentation at RootsTech about the DREAM chip. I was not at RootsTech, but the handout from the presentation is available online and this provides some technical details about the chip:
DREAM consists of ~800,000 markers: 730,000 autosomal, 50,000 X-chromosomal, 18,000 Y-chromosomal, and 1,300 mitochondrial markers. DREAM includes unique ancestry informative markers for 500 worldwide populations. It also includes a large number of ancient markers unique to over 300 ancient genomes that allows inferring relatedness to our ancestors (1000 to 50,000 years ago). These powerful markers allows DREAM full compatibility with the Geographical Population Structure Origins (GPS Origins™) technology. GPS Origins™ traces the geographical origins of your parental ancestries, down to home village in some cases, trace their migration routes, and date their arrival to these locations. GPS Origins™ has a time resolution that ranges from 100 to 10,000 years.
In addition DREAM tests around 2,000 genes to "determine ~40 adaptations (e.g., high altitudes) and special traits (e.g., eye color)".

The GPS Origins test does not currently match you with your genetic cousins but it's possible that this feature will be added in the future. The chip includes around 400 copy number variants (CNVs) which it is claimed will help to improve the accuracy of relationship predictions for 4th and 5th degree relatives (first cousins and first cousins once removed). It should be noted that the currently available cousin-matching tests from AncestryDNA, 23andMe and Family Tree DNA can already be used to make reliable inferences about relationships up to about the fourth cousin level when the results are used in combination with genealogical information. It may be that the use of CNVs is intended to improve inferences when contextual information is not available.

The developer describes DREAM on his blog as "a new microarray that can support concepts that do not yet exist. The difference between DREAM and the old-generation arrays is the same as between smartphones and plain cell phones. They can both make phone calls and text one another, but only smartphones allow running apps. In other words, some of the tests that would be developed on DREAM may work on the old arrays, but not all tests. We’ll do our best to support to all microarrays, of course". (The full blog post can be read here.)

I don't know what the overlap of markers is on the DREAM chip compared with the chips used by AncestryDNA, 23andMe and Family Tree DNA but with additional markers, many of which were specifically selected for biogeographical ancestry, it seems plausible that if a test was done on the chip for which it was designed the results might be much improved. However, it is apparent that many of the problems with this test are related to the methodology, which cannot be replicated and is conceptually unsound. (See my previous review of the GPS Origins test for a fuller discussion of these issues and links to sources.)

Peter Moriarty contacted me after stumbling upon my original review. He has tested on the DREAM chip but he had also previously transferred his raw data to GPS Origins from both 23andMe and AncestryDNA. He has very kindly given me permission to share his reports. This gives us a unique opportunity to compare the results obtained from the DREAM chip with results from AncestryDNA and 23andMe transfers. Here is what Peter says:
Like some of your other contributors I was disappointed with the 1st raw date upload results, which was from my Family Tree results, so I thought I would retry by supplying the raw data from 23andMe. Again the results were disappointing (to say the least), and curiously they show different locations where my DNA apparently first showed a traceable origin. SO, having dug a hole, and having received responses/explanations from GPS Origins that they couldn’t be responsible for raw DNA data from other sources, I jumped in the hole I dug, and ordered a full GPS Origins DNA test. The total costs of these tests was $357.00! So I hope they can be of some benefit to at least expose GPS Origins for what they are.
Here is the migration map that Peter received from his first data upload.

Here is the migration map from Peter's second data upload. Peter does not know which of these maps relates to AncestryDNA and which to 23andMe, and so far the company have not been able to tell him which one is which.

Here are the results that Peter received after being re-tested on the DREAM chip.

Peter also sent me a copy of his Gene Pool percentages which he said were "close to identical from all three test results":


Complete Results

#1 Fennoscandia 20.6% Origin: Peaks in the Iceland and Norway and declines in Finland, England, and France

#2 Southern France 14.5% Origin: Peaks in south France and declines in north France, England, Orkney islands, and Scandinavia

#3 Orkney Islands 12% Origin: Peaks in the Orkney islands and declines in England, France, Germany, Belarus, and Poland

#4 Western Siberia 10.4% Origin: Peaks in Krasnoyarsk Krai and declines towards east Russia

#5 Basque Country 9.5% Origin: Peaks in France and Spain Basque regions and declines in Spain, France, and Germany

#6 Sardinia 8.1% Origin: Peaks in Sardinia and declines in weaker in Italy, Greece, Albania, and The Balkans

#7 Southeastern India 8% Origin: Endemic to south eastern india with residues in Pakistan

#8 Tuva 7% Origin: Peaks in south Siberia (Russians: Tuvinian) and declines in North Mongolia

#9 Northern India 4.3% Origin: Peaks in North India (Dharkars, Kanjars) and declines in Pakistan

#10 Arabia 1.6% Origin: Peaks in Saudi Arabia and Yemen and declines in Israel, Jordan, Iraq, and Egypt

#11 The Southern Levant 1.4% Origin: This gene pool is localized to Israel with residues in Syria

#12 Western South America 0.8% Origin: Peaks in Peru, Mexico, and North America and declines in Eastern Russia

#13 Pima County: The Sonora 0.8% Origin: Peaks in Central-North America and declines towards Greenland and Eskimos

#14 Bougainville 0.6% Origin: Peaks in Bougainville and declines in Australia

#15 Northwestern Africa 0.1% Origin: Peaks in Algeria and declines in Morocco and Tunisia

#16 West Africa 0.1% Origin: Peaks in Senegal and Gambia and declines in Algeria and Morocco

Peter comments on his test results as follows:
My whole and almost only interest in genealogy started as a quest to find out where my Irish Moriarty ancestors lived in Ireland prior to emigrating from Ireland to America. I know the names of the parents of the first ancestor who left arrived in America via Canada in 1961, and am sure they lived in County Kerry, probably on or near the Dingle Peninsula. Of course the 3 autosomal DNA tests contributed little to this quest, so I also took Family Tree’s Y-DNA and mtDNA tests. Interestingly I was contacted by a surname project administrator who told me that I was related to a group of 11 people (so far) who had surnames indicating Irish and English ancestry. They encouraged me to purchase a BigY analysis. I mention all of this because this report indicates that my Irish heritage goes back to at least 365 AD. So this shows, if not proves, that I have Irish ancestry going back at least to that time. The three GPS Origins test results indicate the places where my ancestors’ formations are traceable. As you can see from my GPS Origins results, these locations range from England to Estonia to Switzerland to Sweden to Albania to Georgia and end up in Germany, Russia, Norway, and England! All depending upon which test to believe.

GPS Origins explained away the fact that I don’t show any Irish ancestry results is that their test results probably preceded my records. They also said that probably my maternal and paternal ancestors were from different locations and therefore the GPS Origins results would split the difference and indicate locations somewhere in the middle. Huh? So much for the claim to locate the actual village of origin! Although the paper and historic documentation I have from family records only goes back from 200 years (Irish) and 400 years (German), I believe that my mother was 75% Scotch/Irish + 25% Germanic, and my father was 50% Irish and 50% English, so at least for the past 6 to 10+ generations, they were predominately English/Scotch/Irish. (We also believe there is a little Scandinavian DNA mixed in with the Scotch and perhaps the Irish ancestors), so the GPS Origins results are baffling to say the least.

That having been said, I am only a beginner in understanding DNA. I understand that atDNA tests are good for genealogical research for about 6 generations back, and are also good for describing one’s deep ancestral ethnic makeup. The GPS Origins test results contributed zero to the former, and as far at the latter is concerned, the results may be accurate, but it seems unlikely that my ancestral make up is from such disparate locations as Russia/Siberia (17.4%) and India (12.3%) in addition to Sardinia and Basque Country etc, especially since none of these geographic locations showed up in any of the 3 other autosomal DNA tests that I took, all of which pegged my ancestors as 96-99% Western European!
I should point out that the BigY test Peter took is a Y-chromosome test. The Y-chromosome is passed on from father to son and provides information about ancestry on the direct male line. Y-DNA testing is often used in surname projects because the transmission of the Y-chromosome usually corresponds with the inheritance of surnames. The Y-chromosome doesn't get chopped up like autosomal DNA through the process of recombination and so it can be used to trace male lines back for hundreds or thousands of years.

Autosomal DNA provides information about our ancestors on all our family lines, but because it is diluted with each new generation we only have to go back a few generations before we find ancestors who drop off our genetic family tree. Peter has 64 gggg grandparents, only one of whom was a Moriarty, and so this line represents a tiny fraction of his total pedigree. Although he clearly has deep Irish connections on his Y-DNA line, these results would not be expected to correlate with his genetic ancestry from an autosomal DNA test. In addition, our DNA can only be matched to reference datasets that are in the company's database. If a population is not included then you will be matched to the next closest population. I have been unable to find a full list of the reference populations used by GPS Origins to determine whether or not they have any data from Ireland.
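For anyone who wants to check the pedigree arithmetic behind the figure of 64 gggg grandparents, here is a minimal sketch in Python. The function names are my own, and the sketch assumes a pedigree with no cousin marriages, so that the number of ancestors doubles with each generation. It gives the share of the pedigree only: the amount of autosomal DNA actually inherited from any one ancestor at this distance varies because of recombination and can be zero.

```python
def generations_back(greats: int) -> int:
    """Grandparents are two generations back; each 'great' adds one more."""
    return greats + 2


def ancestors_in_generation(n: int) -> int:
    """Number of ancestral slots n generations back, assuming no pedigree collapse."""
    return 2 ** n


n = generations_back(4)              # gggg grandparents
count = ancestors_in_generation(n)   # ancestral slots in that generation
share = 1 / count                    # pedigree share of a single line

print(n, count, f"{share:.1%}")      # 6 64 1.6%
```

A single surname line six generations back therefore accounts for under two per cent of the pedigree, which is why an autosomal test says so little about it.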

Clearly Peter gained no benefit from being tested on the DREAM chip. In fact the results he received from the full test were even more off the mark than the reports from the transfers. He has paid a hefty price to find this out. Thank you Peter for sharing your results so that others can learn from your experience and will not be tempted to waste their money.

The GPS Origins test was previously sold by DNA Diagnostics Center and had its own dedicated website. The test is now being sold through HomeDNA which appears to be a subsidiary of DNA Diagnostics Center. If you previously tested with the company you will now need to get your account authorised on the new site in order to access your results. The test is currently only sold in the US and Canada.

Within a few hours of publishing this article I was informed by Peter Moriarty that, following a complaint he made to GPS Origins, they provided him with a full refund for all three tests.
