Tuesday, 17 September 2013

My updated ethnicity results from AncestryDNA - a British perspective

AncestryDNA announced last week that they were starting to roll out a free update to their ethnicity results. I noticed today that my updated results were now available. The beta version of AncestryDNA's ethnicity results was widely criticised. Many American customers found that they had much higher percentages of Scandinavian ancestry than expected. As one of the few British customers in the AncestryDNA database I was surprised to find that many of my American friends and genetic cousins had significantly higher percentages of "British" ancestry than me. AncestryDNA also failed to provide any background information on the reference populations used, thus rendering the results essentially meaningless. The new ethnicity results are a slight improvement but, as with all these admixture analyses, still have a long way to go before they can provide any useful information.

When you sign into your Ancestry account you are first of all presented with your old ethnicity results. If you have access to the new ethnicity results you will see a big orange label to click on. As can be seen, my original results from AncestryDNA were 58% Central European, 28% British Isles, 13% European and 4% uncertain.
According to my family history research all my documented ancestors as far back as I can trace them are from the British Isles and predominantly from England. I know the names and birth places of 15 of my 16 great-great-grandparents and they are all English. In this generation I have one illegitimate line which has prevented me from finding out the name of the remaining ancestor. The birthplaces of these 15 great-great-grandparents are: Burrington, Devon; Bristol (2); Thornbury, Gloucestershire; Clapham, London; Colchester, Essex; Sandon, Hertfordshire; Limehouse, London; Bermondsey, London; Merriott, Somerset; Sydenham, Kent; Sydmonton, Hampshire; Kintbury, Berkshire; Westminster, London; Sherston, Wiltshire.

I know the names of 27 of my 32 great-great-great-grandparents, but I only know the birth places of 21 of these ancestors. All of my known ancestors in this generation are again from the British Isles. These are the birth places where known: Ashreigney, Devon; Mariansleigh, Devon; Thornbury, Gloucestershire; Bristol; Great Yeldham, Essex; Preston, Hertfordshire; Sandon, Hertfordshire; Scotland (place not known); Hackney, London; Laverstoke, Hampshire; County Kerry, Ireland; Merriott, Somerset; Rickmansworth, Hertfordshire; Shoreditch, London; Ecchinswell, Hampshire; Welford, Berkshire; Kintbury, Berkshire; Salford, Bedfordshire; Holborn, London; Leighterton, Gloucestershire; Purton, Wiltshire.

The new Ethnicity Estimate 2.0 from AncestryDNA divides the population clusters into 26 global regions. Europe is subdivided into the following regions: Great Britain, Ireland, Europe West, Iberian Peninsula, Finnish/Northern Russia, Italy/Greece, Scandinavia, Europe East and European Jewish. My updated ethnicity percentages from AncestryDNA can be seen below. The percentages are as follows: Europe West 47%, Great Britain 21%, Ireland 20%, Iberian Peninsula 8%, Finnish/Northern Russia 2%, Italy/Greece <1%, Scandinavia <1%.
Ancestry provide somewhat contradictory information on the number of SNPs used for the ethnicity inferences. In their introductory help pages they state that they have increased the number of comparison points (markers) used to determine ethnicity from 30,000 to 300,000. Elsewhere they tell us that they are using "100,000 highly informative SNPs". Your DNA is now analysed more than 40 times to come up with the best estimate and a personalised range. The screenshot below shows the range of results for my "Great Britain" admixture which varied from a low of 0% to a high of 49% in the 40 runs through my DNA. A figure of 21% was picked as the best estimate. My results were then compared with "natives" from the region. A "typical native" of Great Britain supposedly has 60% admixture from Great Britain.
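
To make the idea of a "best estimate and a personalised range" concrete, here is a minimal sketch of how a set of repeated runs could be boiled down to a low, a high and an average. The per-run values are invented for illustration (I only know my reported low and high), and the use of a simple average is an assumption rather than a description of Ancestry's actual calculation.

```python
from statistics import mean


def summarise_runs(run_percentages):
    """Return (low, high, average) for a list of per-run percentages."""
    if not run_percentages:
        raise ValueError("need at least one run")
    return min(run_percentages), max(run_percentages), mean(run_percentages)


# Hypothetical data: 40 invented runs spanning the reported 0% to 49% range.
# These are NOT real AncestryDNA output.
hypothetical_runs = (
    [0, 5, 10, 12, 15, 18, 20, 21, 22, 24] * 3
    + [25, 28, 30, 33, 35, 38, 40, 43, 46, 49]
)

low, high, average = summarise_runs(hypothetical_runs)
print(f"{len(hypothetical_runs)} runs: range {low}%-{high}%, average {average:.1f}%")
```

A range this wide means that any single headline percentage hides a great deal of run-to-run variation.
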
Ancestry explain that what they call the "Great Britain region" is "more admixed than most other regions". They provide examples from their reference populations showing the range of results found with percentages varying from 41% to 100% (see the screenshot below). My 21% from Great Britain obviously makes me a very untypical native! However, the only other British person I know who has tested with AncestryDNA has actually come out even less "British" than me with just 10% admixture from Great Britain and 12% from Ireland. In contrast the American genetic genealogy blogger Blaine Bettinger has reported that his AncestryDNA results show that 55% of his admixture is from Great Britain and 7% is from Ireland. Another American blogger, Judy Russell, who writes the popular Legal Genealogist blog, now finds that, according to AncestryDNA, 49% of her admixture is from Great Britain. I note, however, that the reference population for the "Great Britain region" consists of a mere 195 samples, which is nowhere near adequate to represent the genetic diversity of a population of over 61 million. Ancestry also have a reference population of just 154 people to represent the people of Ireland, and just 416 samples to represent the "Europe West" region which encompasses France, Germany, Switzerland, Austria, the Low Countries, the Czech Republic and northern Italy.
Ancestry also show the percentages from other regions that were found in their Great Britain reference samples.
Ancestry have now provided more details about the reference populations used for their analysis, and have provided a detailed White Paper explaining the methodology behind the calculations. They explain that the reference panel was compiled from "a set of 4,245 DNA samples collected from people whose genealogy suggests they are native to one region". The reference panel candidates included "over 800 HGDP samples, over 1,500 samples from the proprietary AncestryDNA reference collection, and over 1,800 AncestryDNA customers who have explicitly consented to be included in the reference panel". These 4,245 samples were whittled down to provide a final reference panel of 3,000 samples. The 195 samples from Great Britain were reduced to just 111 samples in this process, and the number of samples from Ireland was cut from 154 to 138.
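
As a rough check on these figures, the sketch below works out what share of the candidate panel came from customers and how many samples survived the filtering. The numbers are those quoted above from the White Paper; "over 1,800" is treated as exactly 1,800, so the customer share is a lower bound, and the helper name `share` is my own.

```python
def share(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


candidates_total = 4245   # all reference panel candidates
final_panel = 3000        # samples kept after filtering
customer_minimum = 1800   # "over 1,800" customers, so a lower bound

# (candidate samples, samples kept) for the two regions quoted in the text
regions = {"Great Britain": (195, 111), "Ireland": (154, 138)}

customer_share = share(customer_minimum, candidates_total)
panel_retained = share(final_panel, candidates_total)
region_retained = {name: share(kept, total) for name, (total, kept) in regions.items()}

print(f"Customers: at least {customer_share:.1f}% of candidates")
print(f"Whole panel retained: {panel_retained:.1f}%")
for name, pct in region_retained.items():
    print(f"{name} retained: {pct:.1f}%")
```

On these figures the Great Britain samples were cut far more heavily than the panel as a whole, while most of the Irish samples were kept.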

It is not explicitly stated but I presume that the proprietary reference collection is the Sorenson Molecular Genealogy Foundation database which Ancestry acquired in March 2012. The participants in the SMGF database provided their samples for a non-commercial research project and not for use by a large profit-making company. If the SMGF samples were re-analysed by AncestryDNA then they would be ethically obliged to get consent from the participants for the re-use of their data. It is not clear if this has actually happened.

Almost half of the samples used in the AncestryDNA reference panel were provided by AncestryDNA customers. I presume that these are customers who signed the consent form to participate in AncestryDNA's Human Genetic Diversity Project. As I have written previously, I decided not to participate in this project as I could find no published information to describe what the project entailed. I was also concerned at the somewhat deceptive way in which the consent form was muddled up with the standard terms and conditions, potentially allowing people to join the "project" without providing their informed consent. The AncestryDNA test is currently only on sale in the US. I am one of only a handful of people outside the US who ordered the test in the beta-testing phase before Ancestry stopped shipping kits overseas. Therefore almost half the so-called reference samples provided for the AncestryDNA test are provided by Americans. This will inevitably introduce biases into the reference samples as the people who emigrated to America will not necessarily constitute a random sample of the population of Europe. For example, disproportionate numbers of people emigrated to America from Ireland. This bias no doubt explains why, in the few results seen so far, British people are coming out with much lower percentages from the "Great Britain region" than their American counterparts. Americans of British origin will no doubt be a good proxy for other Americans of British origin but it makes no sense to use British Americans as a reference population for "native" British people. Ancestry do also make it clear in their White Paper that they had difficulty differentiating the population of Great Britain from the rest of Western Europe. Samples from Great Britain were being "mis-assigned a significant amount of Western European ethnicity" and vice versa. My unexpectedly high Irish percentage is also presumably an artefact of the biased sampling process.

The use of an all-American reference population of AncestryDNA customers also explains the decision to lump England, Scotland and Wales together into one large "Great Britain region", and to mix the Republic of Ireland and Northern Ireland together into one "Ireland" region. It would have been much more interesting to split the British Isles up into the four constituent countries, but Ancestry clearly did not have sufficient samples with detailed genealogies from each country to do this, again because the reference samples were mostly from America rather than the British Isles. This once again calls into question Ancestry's decision to market their DNA test exclusively in the US. As most Americans are very interested in finding out more about their ancestry in Europe you would have thought it would be in Ancestry's interests to make their test available in other countries. This would have the added benefit of bringing in many more customers with four grandparents all born in the same country who could be used to provide more representative reference samples. If the AncestryDNA test is ever launched in other countries there is now going to be very little incentive for non-Americans to test as they will be overwhelmed with large numbers of distant cousins in America with little chance of ever finding the connection and no tools to filter out these large numbers of matches.

Ancestry do not provide detailed information about the timeframe which is covered by the new ethnicity estimates though they do explain that the results are provided as an "estimate of the ancient historical origins" of their customers' DNA. They add that "While this information is less relevant for genealogical research relating to the last five to ten generations, it may reveal intriguing clues about the distant history of one’s ancestors."

Even though my admixture results from the new Ethnicity Estimate 2.0 are no better than the estimates from the old beta test, Ancestry have at least responded to the criticisms and have now given details of the reference populations used and have provided us with a commendably detailed technical White Paper, though I cannot understand why such basic features were not included right from the outset. It seems to me that AncestryDNA would have been better off investing their time and energy in providing much-needed matching segment data for their customers rather than tinkering with their "ethnicity" results. These admixture tests are still very much in their infancy and they currently have very little practical application for family history purposes. If you want to have some fun with your DNA results you can get alternative "readings" from the many people who provide a free analysis service. For further details see the ISOGG Wiki page on admixture analyses. In the meantime, if you wish to know your "ethnicity" you should carry on researching your family tree in the traditional way using the paper-based records.

© 2013 Debbie Kennett

14 comments:

Your Genetic Genealogist said...

Hi Debbie,

I haven't had a chance to blog about the new AncestryDNA feature since returning from the conference in DC, but to address two of your points with information offered in the conference call held on Thursday with a few bloggers:
1. AncestryDNA is using the Sorenson samples in this new ethnicity feature.
2. They intentionally declined to state a time frame since they feel that it is virtually impossible to accurately do so using reference samples from present day populations (at least at this point in the science). After listening to the genetic academics this past week, I am inclined to agree with them there.

Great coverage! Thanks for the British perspective.

Debbie Kennett said...

Thanks CeCe for the clarification. I hadn't realised that it would be so difficult to establish a timeframe. I hope the academics will eventually be able to provide us with an explanation as to why it is so difficult to do so.

Charles Acree said...

Debbie -

Helpful blog. You might be amused that my latest Ancestry.com estimates are 69% British, 10% Scandinavian, 9% Eastern European and 8% Italian/Greek - with change. Earlier, I was 57% Central European, 27% Scandinavian and 16% Eastern European. From my family history calculations I know that I'm roughly half British and half German, knowing the identities and birthplaces of 30 of my 32 ggg-grandparents. Charles Acree

Debbie Kennett said...

Thanks Charles for sharing your results. It would be amusing if it were not for the fact that so many people take these results very seriously. It's a good job the British government doesn't accept DNA test results as part of a passport application!

Geolover said...

Debbie, I am glad to see your very timely emphasis on the very limited comparison-database.

As you point out, ethnic estimates in most instances are very premature, particularly when phrased in terms of boundaries of modern geopolitical states.

Debbie Kennett said...

Thanks Geolover for your kind comments. There is an interesting timeline map of Europe here which to my mind demonstrates the futility of trying to assign national labels to DNA groups:

http://www.liveleak.com/view?i=14d_1348362692#ZeoDQsxRmqtiF3Vw.01

Anonymous said...

Hi Debbie, I have tried dabbling in genealogy before but couldn't find some Polish/Jewish people in my tree who moved to America (I am in the UK). What with the name changes, little porkies and the Polish record office burning down etc., I gave up. I went on Ancestry today and saw their ad and I am very tempted, but from the sounds of it, it is not very accurate. Do you think it would be worth it to me, as I would love it if I found some cousins just from the social point of view? Do you know anyone who has actually found a 'new' relative this way? Sorry to crash your comments! Regards, Soleika.

Debbie Kennett said...

Hi Soleika

It would not be worth your while doing the AncestryDNA test. They only sell their kits in the USA anyway.

I suggest you take the Family Finder test with Family Tree DNA which will cost you just £60 at current exchange rates. There are lots of Jewish projects at FTDNA and in fact the company's owners are Jewish. You can find a list of Jewish projects on JewishGen and you can order through them:

http://www.jewishgen.org/DNA/genbygen.html

You might also like to read my blog post on autosomal DNA testing which provides some background information:

http://cruwys.blogspot.co.uk/2013/08/autosomal-dna-testing-is-now-affordable.html

The Family Finder test gives you matches with your genetic cousins from all over the world so you could end up matching cousins in America or Poland.

The Family Finder test includes a Population Finder component which gives you your "ethnicity" percentages and this tool is due to be updated in the near future.

You might also like to consider the 23andMe test, though they've currently suspended their health reports. They have a larger database than FTDNA but their customers tend to be less interested in genealogy and in making contact. If you want to test with both companies to have your DNA in two different "ponds" then you should test first with 23andMe and then do the transfer to FTDNA:

http://www.familytreedna.com/faq/answers.aspx?id=42

If you only want to do one test then FTDNA is the best option.

If you're in London you might like to come along to WDYTYA where you can get help and advice on DNA testing from the ISOGG stand:

http://cruwys.blogspot.co.uk/2014/01/dna-workshop-schedule-for-who-do-you.html

Anonymous said...

Thank you very much for your reply, it is very helpful and much appreciated. Soleika

Debbie Kennett said...

Soleika,

It's a pleasure. I've just realised that the 23andMe transfers to FTDNA will no longer work. FTDNA have just introduced a new chip and I understand that FTDNA can't accept transfers for tests done on the new chip. The 23andMe test is still worth doing, and especially if they restore the health reports which I'm sure they will in due course once they've satisfied the FDA.

Anonymous said...

Ancestry has offices all over...but their headquarters are in Utah. Utah has the highest percentage of English ancestry of all the U.S. states. I wonder if that has anything to do with the results being high from Americans? I know this test has been advertised in Utah, and by word of mouth here (I live in Utah), so maybe that's a reason for many Americans having the results you've mentioned.

Also, I'm curious about this test...with so much knowledge out there on haplogroups and their origins, is that not taken into account? If it is...then tracking back your ancestors, even 20 generations, wouldn't give you insight into what haplogroups you're a part of...as the haplogroups could be originated 20,000 years ago.

BTW...I have VERY strong British ancestry. My G-Grandmother came to the U.S. (Idaho) in 1901 from Lancashire (Heywood)...this is on my mother's side. My father's lineage goes directly through the British Isles, and has a line directly to Edward I, King of England (my 19th G-Grandfather).

I bet, in certain parts of Lancashire, I have far more British DNA than the residents...as I've read that many parts are full of newer immigrants to the British Isles. :)

Debbie Kennett said...

Ancestry is using information from customers to inform the analysis and as 99.9% of their database are in America they are comparing people against British Americans not British people. Ancestry are also using the SMGF database and that included disproportionate numbers of people from Utah. That data could well have skewed the dataset. With all these admixture tests we can expect to see improvements as more reference populations are added and the algorithms improve.

AncestryDNA is only looking at autosomal DNA. The haplogroups come from Y-DNA and mtDNA testing, and so are not relevant for autosomal DNA.

terresakane said...

I have 2 sets of gg grandparents and ggg grandparents who migrated from England to Utah in the 1800s after converting to Mormonism in England. So yes I have a higher percent of British DNA.

Debbie Kennett said...

Terresakane, did your British ancestors intermarry when they arrived in the US? That probably explains why some Americans have these high percentages of "British" ancestry.