Wednesday, 30 November 2016

A review of the GPS Origins test: four ethnicities and four reports

I wrote about the GPS Origins test from DNA Diagnostics Center back in August this year when the test was first launched. There was recently a special offer via Geneabloggers to upload your raw data and receive an interpretation for $29, a big saving on the usual transfer price of $79. I thought I would give it a try out of curiosity. For comparison, three other people with different ancestries have also shared their reports with me and given permission for me to use them on this blog.

First of all let's have a look at what the GPS Origins test claims to offer. Here are the descriptions from the How It Works page:



The GPS Origins report is split into two parts. In the first section you are provided with your gene pool percentages. Here is the explanation of gene pools from the Understanding Your Results page:

The second part of the test provides two migration stories for each customer. Here is the explanation from the Understanding Your Results page:

There is further information about the test on the FAQs page:


The company makes the following claims with regard to the accuracy of the test:


Now let's move on to look at some actual results, starting with my own GPS Origins test.

Debbie Kennett's GPS Origins results
I've done a lot of research on my family tree over the last 15 years. All my known ancestors within the last 500 years are from Britain and Ireland. I have one great-great-great-grandmother who was born in Ireland and one great-great-great-grandfather who was born in Scotland. All my remaining ancestors were born in England and are predominantly from the south and west of the country. I've previously tested with AncestryDNA, 23andMe and Family Tree DNA. My admixture results from these three companies are fairly typical for someone with British ancestry. Each company uses different reference populations and therefore produces different results, but my ancestry comes out at between 41% and 58% British and Irish with the balance made up from a mixture of other European populations. You can see my full admixture results from all three companies here.

For the GPS Origins test I uploaded my raw data from AncestryDNA (v1). Here are my gene pool percentages:

# 1 Fennoscandia 19.8%
Origin: Peaks in the Iceland and Norway and declines in Finland, England, and France

# 2 Western Siberia 12.9%
Origin: Peaks in Krasnoyarsk Krai and declines towards east Russia

# 3 Sardinia 12.4%
Origin: Peaks in Sardinia and declines in weaker [sic] in Italy, Greece, Albania, and The Balkans

# 4 Orkney Islands 11.8%
Origin: Peaks in the Orkney islands and declines in England, France, Germany, Belarus, and Poland

# 5 Southern France 11.3%
Origin: Peaks in south France and declines in north France, England, Orkney islands, and Scandinavia

# 6 Basque Country 11.2%
Origin: Peaks in France and Spain Basque regions and declines in Spain, France, and Germany

# 7 Southeastern India 9.1%
Origin: Endemic to south eastern india with residues in Pakistan

# 8 Tuva 6%
Origin: Peaks in south Siberia (Russians: Tuvinian) and declines in North Mongolia

# 9 Northern India 4%
Origin: Peaks in North India (Dharkars, Kanjars) and declines in Pakistan

# 10 Western South America 1.1%
Origin: Peaks in Peru, Mexico, and North America and declines in Eastern Russia

# 11 Central America 0.2%
Origin: Peaks in Mexico and Central America with residues in Peru

# 12 Northwestern Africa 0.2%
Origin: Peaks in Algeria and declines in Morocco and Tunisia

In the second part of the test I am given a map showing my two migration routes with accompanying migration stories.


You can see an interactive version of my migration routes here. You can view a PDF file with my full GPS Origins report here.

For the blue migration route I am told that my ancestors came from around Croatia prior to 211 AD. They then moved to Ireland at some point before 211 AD and moved to England between 211 AD and 1950 AD. According to my red migration route my ancestors came from Russia prior to 659 AD and arrived in north-western Russia between 659 AD and 1366 AD.

Ann Turner's GPS Origins results
The next report has been shared with me by Ann Turner. Ann's known ancestry is 3/16 German and 1/16 Irish from the early 1800s. The remainder is colonial American, and presumably English. Ann has also tested at 23andMe (v2 and v4), AncestryDNA and Family Tree DNA. Here are her 23andMe results at the speculative setting:


At AncestryDNA Ann's results are: Europe 98% and West Asia 1%. Europe is broken down as follows: Scandinavia 50%, Europe West 15%, Iberian Peninsula 12%, Ireland 12%, Great Britain 6%, trace regions 3%.

With Family Tree DNA's Family Finder test Ann's MyOrigins results are: European 97%, Central South Asia 2%. Europe is broken down into: Western and Central Europe 44%, Scandinavia 35%, British Isles 16%, Southern Europe 1%.

Ann uploaded her AncestryDNA data (v1) to GPS Origins. Here are Ann's gene pools:

# 1 Fennoscandia 22%
Origin: Peaks in the Iceland and Norway and declines in Finland, England, and France

# 2 Southern France 15.5%
Origin: Peaks in south France and declines in north France, England, Orkney islands, and Scandinavia

# 3 Western Siberia 10.9%
Origin: Peaks in Krasnoyarsk Krai and declines towards east Russia

# 4 Southeastern India 10.7%
Origin: Endemic to south eastern india with residues in Pakistan

# 5 Orkney Islands 10.6%
Origin: Peaks in the Orkney islands and declines in England, France, Germany, Belarus, and Poland

# 6 Basque Country 10%
Origin: Peaks in France and Spain Basque regions and declines in Spain, France, and Germany

# 7 Sardinia 9.3%
Origin: Peaks in Sardinia and declines in weaker in Italy, Greece, Albania, and The Balkans

# 8 Tuva 7%
Origin: Peaks in south Siberia (Russians: Tuvinian) and declines in North Mongolia

# 9 Northern India 2.5%
Origin: Peaks in North India (Dharkars, Kanjars) and declines in Pakistan

# 10 The Southern Levant 1.4%
Origin: This gene pool is localized to Israel with residues in Syria

# 11 Western South America 0.2%
Origin: Peaks in Peru, Mexico, and North America and declines in Eastern Russia

Here are Ann's migration routes.



You can see an interactive version of Ann's migration routes here. The PDF File with Ann's full GPS Origins report can be seen here.

Ann's migration stories show that her ancestors came from Greece prior to 696 AD, and from Russia prior to 696 AD. Both of Ann's routes converge on the same location in Germany some time between 696 AD and 1935 AD. 

Piya Changmai's GPS Origins results
From the above two results it would appear that this test is not very helpful for people of Northern European ancestry. Let's now have a look at some results for someone with Asian ancestry. Piya Changmai has kindly shared his results with me. According to his family history Piya has 5/8 of his ancestry from Thailand and 3/8 from Southern China. He describes his ancestry as follows:
I have a Chinese paternal great-grandfather, so he contributed 1/8 of my ancestry. I have also a Chinese maternal grandmother, so her contribution is 2/8. Other ancestors are Thai and Laotian ethnics from Thailand. Thai and Laotian are closely related ethnics, like Czech and Slovak. In summary, I have Chinese ancestry 2/8+1/8 = 3/8 and Thailand (Thai and Laotian) ancestry 5/8. Both Chinese ancestors are from Southern part of China, also reflected by Y and mt haplogroups (O2a1a and F4b, respectively).
Piya has also tested at 23andMe. Here are his 23andMe results at the standard setting:


Here are Piya's 23andMe results at the speculative setting:


Piya uploaded his 23andMe (v4) data to GPS Origins. Here are Piya's gene pool results:

# 1 Austronesian Oceania 33.4%
Origin: Peaks in Korea, Chinese (Han), Mynamar, Japan, and Vietnam and declines towards West China and India

# 2 Austronesian Southeast Asia 27.1%
Origin: Peaks in Taiwan and Malay and declines in Thailand, Vietnam, Cambodia, and South China

# 3 Central America 6.4%
Origin: Peaks in Mexico and Central America with residues in Peru

# 4 Sino-Tibetan and Hmongic Southeast Asia 5.8%
Origin: Peaks in East Asia, Central-south China (Lahu, Naxi, Yi) and declines towards India

# 5 Tuva 4.2%
Origin: Peaks in south Siberia (Russians: Tuvinian) and declines in North Mongolia

# 6 Central Southern China: Yunnan and Guangxi 4%
Origin: Peaks in East Asia (East) and Chinese (She, Dai) with residues in Central south China (Han, Miao, Tujia)

# 7 Western Siberia 3.2%
Origin: Peaks in Krasnoyarsk Krai and declines towards east Russia

# 8 Pima County: The Sonora 3.1%
Origin: Peaks in Central-North America and declines towards Greenland and Eskimos

# 9 Southeastern India 2.9%
Origin: Endemic to south eastern india with residues in Pakistan

# 10 Papuan New Guinea 1.8%
Origin: Peaks in Papua New Guinea and declines in Australia

# 11 Bougainville 1.6%
Origin: Peaks in Bougainville and declines in Australia

# 12 Southern France 1.3%
Origin: Peaks in south France and declines in north France, England, Orkney islands, and Scandinavia

# 13 Southwestern India 1.3%
Origin: Endemic to Indian (Pulayar) with residues in India (Paniya, Savara, Bengali, Juang, Savara, Ho, Bonda)

# 14 Northern India 1.1%
Origin: Peaks in North India (Dharkars, Kanjars) and declines in Pakistan

# 15 Western South America 1%
Origin: Peaks in Peru, Mexico, and North America and declines in Eastern Russia

# 16 Northern Mongolia and Eastern Siberia 1%
Origin: Peaks in North Mongolia and declines in Siberia

# 17 Northwestern Africa 0.5%
Origin: Peaks in Algeria and declines in Morocco and Tunisia

# 18 The Southern Levant 0.3%
Origin: This gene pool is localized to Israel with residues in Syria

Here is Piya's migration map.

You can see an interactive version of Piya's migration map here. (His report is under the pseudonym Mee Panda.) A PDF file with Piya's full GPS Origins results is available here.

Both of Piya's migration routes start in the same place in Kyrgyzstan. Piya is told that his ancestors came from Kyrgyzstan prior to 1183 AD. His ancestors on the northern route arrived in northern China between 1183 AD and 1617 AD. Piya's southern migration route ends in Singapore, where his ancestors are said to have arrived some time between 1150 AD and 1751 AD.

Ezgi Altinisik's GPS Origins results
The final set of results I'll be looking at are from Ezgi Altinisik. She is from Turkey. Her paternal grandfather was born in Bulgaria and her paternal grandmother was born in Romania but both were Turkish and moved back to Anatolia around 1930. Her maternal grandmother is from Siverek in Turkey. Her maternal grandfather is from Samsun on the north coast of Turkey. As far as she knows, all her maternal ancestors have resided in Turkey for a long time.

Ezgi has also tested at 23andMe. Here are her 23andMe results at the standard level:


Here are her 23andMe results at the speculative level:


Ezgi uploaded her 23andMe data (v4) to GPS Origins. Here are Ezgi's gene pool results:

# 1 Southern France 14.6%
Origin: Peaks in south France and declines in north France, England, Orkney islands, and Scandinavia

# 2 Fennoscandia 14.6%
Origin: Peaks in the Iceland and Norway and declines in Finland, England, and France

# 3 Southeastern India 12.8%
Origin: Endemic to south eastern india with residues in Pakistan

# 4 Western Siberia 10.9%
Origin: Peaks in Krasnoyarsk Krai and declines towards east Russia

# 5 Orkney Islands 9.7%
Origin: Peaks in the Orkney islands and declines in England, France, Germany, Belarus, and Poland

# 6 Tuva 7.9%
Origin: Peaks in south Siberia (Russians: Tuvinian) and declines in North Mongolia

# 7 Sardinia 7.8%
Origin: Peaks in Sardinia and declines in weaker in Italy, Greece, Albania, and The Balkans

# 8 Arabia 5.7%
Origin: Peaks in Saudi Arabia and Yemen and declines in Israel, Jordan, Iraq, and Egypt

# 9 The Southern Levant 5.5%
Origin: This gene pool is localized to Israel with residues in Syria

# 10 Basque Country 3.9%
Origin: Peaks in France and Spain Basque regions and declines in Spain, France, and Germany

# 11 Northern India 3.8%
Origin: Peaks in North India (Dharkars, Kanjars) and declines in Pakistan

# 12 Austronesian Southeast Asia 1.3%
Origin: Peaks in Taiwan and Malay and declines in Thailand, Vietnam, Cambodia, and South China

# 13 Central America 0.8%
Origin: Peaks in Mexico and Central America with residues in Peru

# 14 Western South America 0.8%
Origin: Peaks in Peru, Mexico, and North America and declines in Eastern Russia

Here is Ezgi's migration map.


Ezgi's interactive migration map can be seen here. A PDF file with Ezgi's full GPS Origins results is available here.

The blue migration route shows that Ezgi's ancestors came from Russia prior to 1244 AD. Her ancestors then passed through Turkey on their DNA journey and ended up in Crete some time between 1244 AD and 1557 AD. According to the red migration route Ezgi's ancestors came from Turkey prior to 1037 AD, and arrived in Armenia some time between 1037 AD and 1527 AD.

Discussion
This is only a very small sample of four test results, but if these results are representative it would appear that the GPS Origins test is not very helpful.

The gene pool results are very strange and correlate poorly with the results we might expect for the reported ethnicities. For instance, both people of Northern European ancestry (Ann and I) and the Turkish person (Ezgi) in this small sample have unexpectedly high and very similar percentages of ancestry components from Siberia (18% to 19%) and India (13% to 17%). The model seems to overestimate these components for all West Eurasians. The components from Orkney (9.7% to 11.8%) and Sardinia (7.8% to 12.4%) are also similar in these three individuals.

The Thai person (Piya) has an unexpectedly high component of around 10% Native American, but only just over 5% from India. Both Ann and I, who have recent all-European ancestry, came out with more than double this Indian component. We would expect much more Indian ancestry in a Thai person as compared to a European person, based on the history of Thailand and recent genetic research (Mörseburg et al 2016).
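The totals quoted in the two paragraphs above can be checked with a short Python sketch. The percentages are copied from the four reports listed earlier in this post. The grouping of gene pools into "Siberia", "India" and "Native American" is an assumption inferred from which figures add up to the quoted totals; it is not a categorisation published by GPS Origins, and the names `reports`, `GROUPS` and `group_total` are purely illustrative.

```python
# Gene pool percentages copied from the four reports quoted in this post.
# Only the pools needed for the three groupings below are included.
reports = {
    "Debbie": {"Western Siberia": 12.9, "Tuva": 6.0,
               "Southeastern India": 9.1, "Northern India": 4.0,
               "Western South America": 1.1, "Central America": 0.2},
    "Ann": {"Western Siberia": 10.9, "Tuva": 7.0,
            "Southeastern India": 10.7, "Northern India": 2.5,
            "Western South America": 0.2},
    "Ezgi": {"Western Siberia": 10.9, "Tuva": 7.9,
             "Southeastern India": 12.8, "Northern India": 3.8,
             "Central America": 0.8, "Western South America": 0.8},
    "Piya": {"Western Siberia": 3.2, "Tuva": 4.2,
             "Southeastern India": 2.9, "Southwestern India": 1.3,
             "Northern India": 1.1, "Central America": 6.4,
             "Pima County: The Sonora": 3.1, "Western South America": 1.0},
}

# ASSUMPTION: these groupings are inferred for illustration only.
GROUPS = {
    "Siberia": ["Western Siberia", "Tuva"],
    "India": ["Southeastern India", "Southwestern India", "Northern India"],
    "Native American": ["Central America", "Pima County: The Sonora",
                        "Western South America"],
}


def group_total(person, group):
    """Sum one person's percentages over the gene pools in a grouping."""
    pools = reports[person]
    return round(sum(pools.get(name, 0.0) for name in GROUPS[group]), 1)


for person in ("Debbie", "Ann", "Ezgi"):
    print(person, group_total(person, "Siberia"), group_total(person, "India"))

print("Piya", group_total("Piya", "Native American"), group_total("Piya", "India"))
```

On these groupings the three West Eurasian reports come out at 17.9% to 18.9% for Siberia and 13.1% to 16.6% for India, while Piya's report gives 10.5% Native American and 5.3% India.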

The maps do not always correspond with the countries in the gene pools. Fennoscandia is supposed to encompass Norway, Sweden, Finland, Denmark and "a part of Russia known as the Kola Peninsula". However, on the map it covers Iceland, Norway, Finland, Britain and France but excludes Sweden and Denmark. The map for the Western Siberian gene pool covers the whole of Russia. The Austronesian Oceania gene pool seems to be misnamed given its geographical distribution and probably should be renamed as Northeast Asia. Only Korea, Japan, Vietnam and Myanmar (Burma) are highlighted on the map yet these countries are not in Oceania and the people do not speak any Austronesian languages. China is missing from the map, although the text states that the component peaks in Han Chinese, among other populations. Pavel Flegontov, a geneticist at the University of Ostrava in the Czech Republic, tells me that in all other ADMIXTURE analyses he's seen, this Northeast Asian component has a much wider distribution in Siberia and much lower percentages in Myanmar and Vietnam. He suggests that if the components really have the distributions shown on the maps, that clearly demonstrates that GPS Origins reports artefacts of an overly complex admixture model with 36 components.

According to the legend on the migration maps "Although the Migration Patterns represent your Maternal and Paternal DNA route, we cannot differentiate which route is specifically your parents’ individual route at this time." However, the GPS Origins test does not phase the genetic data (phasing is the process of sorting the alleles onto the maternal and paternal chromosomes) so it is not clear how the paternal and maternal routes are defined in the first place. If an individual has reported ancestry from predominantly one region then surely we would expect the migration routes to be broadly similar for both the maternal and paternal lines.

The co-ordinates are supposed to represent places where "significant genetic mixture took place at the gene pool level", but the proposed migration routes are at times bizarre and do not correspond with historical records. For example, there are no large-scale historical migrations from Croatia to Ireland, from Kyrgyzstan to Singapore, or from Russia to Crete. The precision of the geographical co-ordinates down to three decimal places gives a false sense of accuracy, but the methodology is opaque.
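To put the three decimal places into perspective, here is a rough Python sketch of how much ground one step in the last decimal place of a co-ordinate covers. It assumes a spherical Earth with a mean radius of 6,371 km, and the latitude of 52 degrees north is simply an illustrative value for southern England; neither figure comes from the GPS Origins report.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (spherical approximation)


def step_size_m(decimals, latitude_deg):
    """Ground distance, in metres, of one unit in the last decimal place
    of a co-ordinate expressed in degrees.

    Returns (north_south, east_west). The east-west distance shrinks with
    the cosine of the latitude because meridians converge towards the poles.
    """
    step_rad = math.radians(10 ** -decimals)
    north_south = EARTH_RADIUS_M * step_rad
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


# ASSUMPTION: 52 degrees north is an illustrative latitude only.
north_south, east_west = step_size_m(3, 52.0)
print(f"0.001 degrees is about {north_south:.0f} m north-south "
      f"and {east_west:.0f} m east-west")
```

In other words, a co-ordinate quoted to three decimal places implies a location pinned down to roughly a hundred metres, which is far more precise than anything that could plausibly be inferred about where ancestors lived many centuries ago.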

The concept of the dual migration pathways is difficult to understand. If we go back one thousand years, in theory we all have over 8,000 million genealogical ancestors. It is inconceivable that half of these ancestors would all go off on their travels in one direction and the other half would go in a different direction.
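The figure of over 8,000 million comes from doubling the number of ancestors in each generation. A minimal sketch in Python, assuming an average generation length of 30 years (an assumption chosen for illustration; a shorter generation length gives an even larger number):

```python
def ancestor_slots(years_back, years_per_generation=30):
    """Return (generations, slots) for a pedigree going back `years_back` years.

    Each generation doubles the number of places in the pedigree:
    2 parents, 4 grandparents, 8 great-grandparents, and so on.
    ASSUMPTION: the 30-year default generation length is illustrative.
    """
    generations = years_back // years_per_generation
    return generations, 2 ** generations


generations, slots = ancestor_slots(1000)
print(f"{generations} generations back: {slots:,} places in the pedigree")
# 33 generations back: 8,589,934,592 places in the pedigree
```

This counts places in the pedigree rather than distinct people. In practice the same individuals fill many of those places, because the number of places far exceeds the population alive at the time, which is why the figure only holds "in theory".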

The algorithms for the GPS Origins test have been developed by scientists at the University of Sheffield led by Eran Elhaik. However, as mentioned in a previous blog post, the underlying research by Elhaik et al (Nature Communications, 2014) on which this test is based has proved to be controversial. The results have been called into question by Flegontov et al (2016) who conclude that GPS is a "genetic provenancing approach" which is "at best only suited to inferring the most likely geographic location of modern and relatively unadmixed genomes, and tells nothing of population history and origin".

Since then a further analysis has been published by Andrew Millard, an archaeological scientist at the University of Durham. He was unable to reproduce the mathematical calculations and concluded:
...the mathematical methods described are incoherent, the supplementary data is not that used to create the figures or equations in the paper, and the supplementary code does not implement the methods described. The paper is methodologically unsound and not reproducible.
There have also been additional concerns about an undeclared conflict of interest on the part of Eran Elhaik and Tatiana Tatarinova, the lead authors of the GPS paper in Nature Communications. This omission has now been partially rectified, somewhat belatedly, with the publication on 31 October 2016 of a corrigendum. However, the new conflict of interest statement does not mention the relationship that the two authors already appear to have had in place with Prosapia Genetics prior to publication. The Prosapia Genetics domain name was originally registered to Tatarinova. On the very day that the paper was released Prosapia started selling a commercial GPS test. In a video published to accompany the press release issued by the University of Sheffield Eran Elhaik suggested that people should upload their genotype data to "our website" to find out their geographical homeland. The Prosapia URL (www.prosapiagenetics.com) was included at the end of this video. The video has since been edited to remove the URL but the original unedited video can be viewed on the Daily Mail website. The original Prosapia GPS test no longer seems to be available and the website now returns a warning message. An early version of the website dating from 3 May 2014, a few days after the publication of the paper, can be found in the Internet Archive.

Conclusion
The GPS Origins test does not provide meaningful results and has no practical application for the genetic genealogist. If you wish to use your raw autosomal DNA data from one of the commercial testing companies to get an alternative admixture analysis I recommend using one of the free services such as DNA.Land or GedMatch instead.

Update 1st December 2016
Eran Elhaik has published a response to this article, "GPS Origins results for four participants", on his Khazar DNA Project blog.

Update 8th December 2017
The GPS Origins test is now marketed by a company known as Home DNA. See the article "Genome culture: a holiday gift-giving guide" by genetics counsellor Laura Hercher on the deficiencies of the Home DNA privacy policy and the limitations of their tests.

Acknowledgements
Thanks to Ann Turner, Piya Changmai and Ezgi Altinisik for sharing their results. Thanks to Pavel Flegontov and Ann Turner for helpful comments on early drafts of this blog post.

 © 2016 Debbie Kennett

10 comments:

Seaton Smithy said...

At least Ann's migration paths end up in the same place - unlike the others!

Thanks for the thorough review of GPS Origin's (not unexpected) unreliability.

Debbie Kennett said...

Strangely enough there was someone in the ISOGG Facebook group who had a migration route that ended up in exactly the same place as Ann's too - the co-ordinates were identical.

liz ewing said...

Hi Debbie

I am just wondering when the 450 reference populations will be added, as there seem to be at best a couple of dozen reference populations! When will more be added?

Also, is there a place on GPS Origins where you can go and see how your results compare with people from different geographical areas?

Kind Regards

Liz Ewing

Debbie Kennett said...

I understood from the website that those 450 or 500 reference populations were already included in the test and that those are the populations used to create the gene pools.

There is no facility on the GPS Origins website to compare your results with other people.

Kathryn Creeden said...

My results didn't make much sense to me. My migration routes also ended up in Berlin like Ann's. According to Ancestry and FTDNA, I have somewhere between 50-73% British Isles ancestry, but that was nowhere to be found in the migration routes.

I got this note from their customer service when I questioned the results:

Gene pools are ancient history (50,000 years ago) and migration patterns are more recent history (2,000-3,000 years ago). They don’t always match.

Like you, I'm glad I got this on the Geneabloggers special and didn't pay full price!

Debbie Kennett said...

If we go back 50,000 years then we all have a common set of ancestors so the concept of gene pools from this time depth doesn't make any sense at all. I can't make any sense of the migration routes either. However, the AncestryDNA percentages can't be relied on either. I'm only 21% "British" according to AncestryDNA despite the fact that all my identifiable ancestors were born in Britain in the last 500 years or so.

Kathryn Creeden said...

I agree, the 50,000 years didn't make sense and really has no use for anyone interested in their ancestry from the past few hundred years.

Ancestry said I had 73% British and only a small amount of Western Europe. My father was 75% German, so that was a bit suspect. FTDNA said I had about 50% British and 23% Western Europe which made more sense. I think the ethnicity charts are fun to look at, but I wouldn't rely on any of them. From the GPS Origins website, I was hoping for charts along the same lines as the other companies, but maybe with a bit more detail on locations. The results weren't what I was expecting at all.

Debbie Kennett said...

None of the companies can reliably distinguish between French, German and British DNA. These country percentages appear in varying proportions in anyone of Northwest European ancestry. Dave Dowell did an interesting blog post comparing his wife's results with mine. His wife has five great-grandparents from Germany and all my eight great-grandparents were born in England. However you wouldn't think so from the DNA results!

http://blog.ddowell.com/2015/05/a-year-ago-i-blogged-about-paucity-of.html


Peter Moriarty said...

I read Debbie's review of the GPS Origins test published last November, and thought that the readers might be interested in a similar experience that I am currently going through. I have autosomal test results from Ancestry, 23andMe, and FamilyTree. In addition I have yDNA and mtDNA results from FamilyTree and bigY. The autosomal services split the pie in different ways but they all indicate that my ancestors are about 97% Western European. If I interpolate them, this breaks down into 60% Irish/Scotch; 25% English; and 15% Germanic. I uploaded my Family Tree and 23andMe raw data to GPS Origins, and like the review, got wildly different results from GPS indicating that we are only 55% Western European plus a mix of Russia, India, Sardinia, Siberia, Middle Eastern, and Orkney Island DNA that seems highly unlikely given the geographic dispersion of the results vs known human migration patterns. So, I forked out the $199 to take the GPS Origins test, hoping to get better results, but in fact received almost identical results. However, the results of the 3 packages of raw data as interpreted by GPS are quite different with respect to where my traceable DNA first formed and to where it migrated. In addition, I have more than anecdotal information that some of my ancestors lived in Ireland for at least 1000-1500 years. This does not show up anywhere in the GPS Origins results. Their explanation is that maybe the GPS Origins results pre-date our Irish ancestry, which doesn't make a lot of sense to me, because I thought that the Irish DNA is pretty distinguishable because there hasn't been a lot of inter-nationality mixing in Ireland, especially SW Ireland where we are from. In addition, this does not explain the wide and unlikely dispersion of the 45% of our ancestry that according to GPS Origins is non-Western European. I have the images of these results that I can supply but I can't figure out how to attach them. peterhmoriarty@gmail..com

Debbie Kennett said...

Thank you Peter for sharing your results and for being a guinea pig. I'd wondered if it would make any difference if the test was done on the GPS chip rather than just doing a transfer but clearly it had little impact. I don't believe GPS Origins have any samples from Ireland in their reference dataset. You can only be matched with populations if they are included in the dataset.

You might find this lecture presented by Garrett Hellenthal at WDYTYA Live of interest as it explains how the process works:

https://youtu.be/iGUhMs0Ttls