Thursday, 21 May 2015

The business of genetic ancestry


There is a programme going out next Tuesday 26th May on BBC Radio 4 on the "business of genetic ancestry". The programme is presented by Dr Adam Rutherford and includes interviews with Professor Mark Jobling from the University of Leicester, my colleague Professor Mark Thomas from University College London and yours truly! You can find further details about the programme here:

http://www.bbc.co.uk/programmes/b05vy4kb

The programme will also be available on the iPlayer and it will be repeated on Monday 1st June at 9pm.

I paid a visit to New Broadcasting House at the end of last month to record my interview with Adam. I was interviewed for one hour and 20 minutes, and we had a wide-ranging conversation about the exciting discoveries that can be made from DNA testing but also about some of the problematic press coverage and the dubious claims made by certain companies. Obviously only a tiny fraction of what I said will make the final cut, so I shall be interested to see how it turns out.
In the recording studio with Adam Rutherford.
I have been very impressed with the way that the programme has been made. The BBC have done a lot of research behind the scenes and have gone to great pains to talk to a wide range of people and to present a balanced view, showing the benefits of genetic ancestry testing while also highlighting some of the problematic areas. They invited people from all sides of the debate to contribute to the programme, though I understand that unfortunately not everyone who was invited to participate chose to do so.

If you listen to the programme do let me know what you think.

Related blog posts
- Driving in the wrong direction with a dodgy DNA satnav
- More on the S4C DNA Cymru controversy and my review of "Who are the Welsh?"


© 2015 Debbie Kennett

Sunday, 17 May 2015

Three generations of FTDNA MyOrigins admixture results

I wrote last week about comparing my admixture results from Ancestry, 23andMe and Family Tree DNA. At Family Tree DNA I have now tested three generations of my family so I thought it would be interesting to compare the MyOrigins results across the generations. I've provided below a summary of the ancestry for each person tested together with a screenshot of their results. Click on the images to enlarge them.

Debbie's dad
Four grandparents born in England: Bristol, Gloucestershire, London (x2).

Eight great-grandparents born in England: Bristol (x2), Devon, Essex, Gloucestershire, Hertfordshire, London (x2).

Fifteen great-great-grandparents born in England: Devon (x2), Bristol, Essex, Gloucestershire, Hertfordshire (x2), London.
One great-great grandparent born in Scotland (location not known).
The birthplace of seven of his English great-great-grandparents is unknown. Four were probably born in Bristol or in a nearby county. Three were Londoners who could have moved to London from anywhere in England.


Debbie's mum
Four grandparents born in England: London (x2), Hampshire (x2).

Eight great-grandparents born in England: Berkshire, Hampshire, London (x3), Somerset, Wiltshire. The birthplace of one great-grandparent is not known but he was probably born in London.

Fifteen great-great-grandparents born in England: Bedfordshire, Berkshire (x2), Gloucestershire, Hampshire (x2), Hertfordshire, London (x2), Somerset (x2), Wiltshire.
One great-great-grandparent born in Ireland: County Kerry.
The birthplace of three of her English great-great-grandparents is unknown. One was probably born in Hampshire. The other two were probably Londoners who could have come from anywhere in the country.


Debbie
Four grandparents born in England: Bristol, London (x3).

Eight great-grandparents born in England: Bristol, Gloucestershire, Hampshire (x2), London (x4).

Sixteen great-great-grandparents born in England: Berkshire, Bristol (x2), Devon, Essex, Gloucestershire, Hampshire, Hertfordshire, London (x5), Somerset and Wiltshire.
The one great-great-grandparent with an unknown birth location was probably born in London.

Twenty-four great-great-great-grandparents born in England: Berkshire (x2), Bristol, Devon (x2), Essex, Gloucestershire (x2), Hampshire (x2), Hertfordshire (x3), London (x5), Somerset (x2), Wiltshire.
One great-great-great-grandparent born in Ireland: County Kerry.
One great-great-great-grandparent born in Scotland (location not known).
The birthplace of the remaining eight English great-great-great-grandparents is unknown but they were probably born in Bristol, London and Hampshire.


Debbie's husband
Four grandparents born in England: Cambridgeshire (x2), Cumberland, Devon.

Eight great-grandparents born in England: Cambridgeshire (x3), Devon (x2), Dorset, Somerset, Surrey.

Sixteen great-great-grandparents born in England: Cambridgeshire (x3), Devon (x4), Hampshire, Herefordshire, Hertfordshire, Huntingdonshire (x2), Somerset (x2), Surrey (x2).

Twenty-six great-great-great-grandparents born in England: Cambridgeshire (x6), Devon (x8), Hampshire, Herefordshire (x2), Huntingdonshire, Somerset (x4), Surrey (x3), Sussex.
The birthplace of his remaining six English great-great-great-grandparents is unknown. Three were probably born in Cambridgeshire, two in Hertfordshire and one in Surrey.


Debbie's eldest son


As can be seen, there is considerable variation between family members. This is only to be expected because of the random nature of DNA inheritance. However, some of the differences are somewhat more extreme than might be intuitively expected. For example, 57% of my DNA matches the British Isles cluster whereas only 40% of my dad's DNA matches the British Isles and only 7% of my mum's DNA.
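
One rough way to gauge how much of this variation random inheritance alone could account for is to work out the range of values a child could have, given the parents' percentages. The sketch below is only an illustration: it assumes that a cluster label is a fixed property of a stretch of DNA and travels with it from parent to child, which is not how the companies actually compute their estimates. The function name child_share_bounds is made up for this example.

```python
def child_share_bounds(father_share, mother_share):
    """Bounds on the share of a child's genome carrying a given label.

    Each share is the fraction (0 to 1) of a parent's genome assigned to one
    cluster. Assumption for illustration only: a label is a fixed property of
    a stretch of DNA and is passed on unchanged with that stretch.
    """
    # On average a child receives half of whatever each parent carries.
    expected = (father_share + mother_share) / 2
    # Each parent supplies half of the child's genome, so the labelled DNA
    # from one parent can fill at most half of the child's genome.
    maximum = min(father_share, 0.5) + min(mother_share, 0.5)
    # A parent must pass on labelled DNA wherever both of their copies carry it.
    minimum = max(father_share - 0.5, 0.0) + max(mother_share - 0.5, 0.0)
    return minimum, expected, maximum


# British Isles percentages quoted above for my dad (40%) and my mum (7%).
low, mid, high = child_share_bounds(0.40, 0.07)
print(f"expected {mid:.1%}, possible range {low:.1%} to {high:.1%}")
```

Under these simplified assumptions the ceiling for a child of parents with 40% and 7% is 47%, which is below my reported 57%. That would suggest that some of the difference comes from the way the percentages are estimated rather than from inheritance alone.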

My dad, my husband and my son all come out with small percentages of "Middle Eastern" DNA. My husband and son's "Middle Eastern" DNA appears on the map over Turkey, Georgia and Azerbaijan whereas my dad's supposedly Middle Eastern DNA is centred over Egypt and Jordan. I've noticed that a significant proportion of the members of my Devon DNA Project with predominantly British ancestry are coming out with these small percentages of "Middle Eastern". Clearly this does not mean that any of them have recent ancestry from the Middle East, and it is probably related to the limitations of the available reference populations. I hope to look at the Middle Eastern issue in more depth in a subsequent blog post.

As I mentioned in my previous blog post, these admixture results should really only be used for entertainment value at present. However, the results are likely to change over time as more reference populations become available and as the methodology improves. Admixture tests should really be regarded as a bonus feature of an autosomal DNA test and should not be the primary purpose for testing. While your admixture results will not tell you whether your great-great-grandfather was Scottish or Irish, you might find instead that you match with a cousin who is descended from the ancestor of interest who will be able to fill in the blanks in your family tree. Cousin-matching tests will become increasingly useful as the databases continue to grow in size.
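
To put some numbers on why a single distant ancestor is so hard to pick out, the sketch below works out how many ancestors there are in each generation and the average share of your autosomal DNA that each one contributes. These are averages only; the amount actually inherited from any one ancestor varies because of recombination. The names used in the code are just for this illustration.

```python
from fractions import Fraction

# Generations back from the person tested (1 = parents).
GENERATIONS = {
    "parents": 1,
    "grandparents": 2,
    "great-grandparents": 3,
    "great-great-grandparents": 4,
    "great-great-great-grandparents": 5,
}


def ancestors_in_generation(n):
    """Number of pedigree ancestors n generations back (ignoring pedigree collapse)."""
    return 2 ** n


def expected_share(n):
    """Average fraction of autosomal DNA from one ancestor n generations back."""
    return Fraction(1, 2 ** n)


for label, n in GENERATIONS.items():
    share = float(expected_share(n))
    print(f"{label}: {ancestors_in_generation(n)} ancestors, {share:.2%} each on average")
```

A single great-great-grandparent contributes about 6% on average and a great-great-great-grandparent about 3%, which is of the same order as the small percentages that these tests assign to unexpected regions.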

© 2015 Debbie Kennett

Saturday, 16 May 2015

Comparing admixture results from AncestryDNA, 23andMe and Family Tree DNA

I have now taken an autosomal DNA cousin-matching test at all three testing companies – 23andMe, AncestryDNA and Family Tree DNA. With this type of test you also get, as a bonus feature, a report of your admixture percentages. I thought it would be a useful exercise to do a comparison of my admixture results from all three companies.

All my known ancestors on all my lines within the last 500 years are from the British Isles. Here is a breakdown on a generation by generation basis:

- All four of my grandparents were born in England. One grandparent was born in Bristol, and my other three grandparents were born in London.

- All eight of my great-grandparents were born in England. Four of my great-grandparents were born in London, two were born in Hampshire, one was born in Bristol and one was born in Gloucestershire.

- I know the birthplaces of 15 of my 16 great-great-grandparents and they were all born in England in the following locations: Berkshire, Bristol (x2), Devon, Essex, Gloucestershire, Hampshire, Hertfordshire, London (x5), Somerset and Wiltshire. My great-great-grandparent with an unknown birth location was very likely to have been born in London.

- I know the birthplaces of 24 of my 32 great-great-great-grandparents. I have one ggg grandmother who was born in Ireland, and one ggg grandfather who was born in Scotland. My other ggg grandparents were all born in England in the following locations: Bedfordshire, Berkshire (x2), Devon (x2), Bristol, Essex, Gloucestershire (x2), Hampshire (x2), Hertfordshire (x3), London (x5), Somerset (x2), Wiltshire. My other eight ggg grandparents are all most likely to have been born in England, probably in Bristol, London and Hampshire.

I am probably fairly typical of someone with ancestry from the south and west of England whose ancestry has been filtered through the melting pot of London.

Here is my Ethnicity Estimate from AncestryDNA. According to the AncestryDNA FAQs (Interpreting my results Q5) the test can "reach back hundreds, maybe even a thousand years, to tell you things that aren't in historical records". In the Ethnicity Estimate White Paper AncestryDNA caution that "Genetic estimates of ethnicity also go back thousands of years, beyond the end of a pedigree paper trail. Regions identified as “populations” in a pedigree may have been very different thousands of years ago, and so may be represented differently in a genetic ethnicity estimate."


Here is the MyOrigins report from Family Tree DNA. The timeframe for the genetic clusters is not given but in the MyOrigins White Paper it is stated that the clusters "span extant modern human genetic variation" but are also "reflective of ancient migrations and admixtures".


23andMe provide the most sophisticated tools. They offer three different Ancestry Composition reports - conservative, standard and speculative. They tell us that these results "reflect where your ancestors lived before the widespread migrations of the past few hundred years".

Here is my conservative estimate:


Here is my standard estimate:


Here is my speculative estimate:


23andMe also provide a chromosome view which shows the breakdown of your admixture across all your different chromosomes. Here is my chromosome view in the speculative mode:

As can be seen, these admixture percentages bear little resemblance to my documented pedigree, and when the companies try to break down Europe into individual countries they come out with quite variable results. It is perhaps only to be expected considering that there is a very limited range of reference populations available. The companies all supplement the publicly available datasets by using samples from their customer databases, but they still only have a very small number of samples from the British Isles. Here is a list of reference samples for Britain and Ireland for each of the three companies.

AncestryDNA reference samples
Great Britain: 111
Ireland: 138
Source: The AncestryDNA Reference Panel (version 2.0) (available to AncestryDNA customers)

Family Tree DNA MyOrigins reference samples
British: 39
Irish: 45
Scottish: 43

23andMe reference samples

As can be seen, both Ancestry and Family Tree DNA have very small sample sizes from the British Isles. They have not made any attempt to split the samples into constituent countries. Northern Ireland would be expected to be genetically very similar to Scotland but we don't know if Ancestry's Irish samples are from the north or the south of Ireland or from both countries combined. We don't know where in Great Britain their samples were taken from. Family Tree DNA seem to think that Scotland has already separated from Britain and is a country in its own right! In view of this, it is not clear if their British samples also include people of Scottish ancestry or if they now think that Britain only consists of England and Wales. It is also not known if their Irish samples are for people with ancestors from the whole of Ireland or just from the Republic of Ireland.

23andMe have the benefit of a larger dataset but this has not improved the accuracy of their reports, and they have a confusing grasp of geography. They label a cluster as "British and Irish" but describe samples collected from the UK and Ireland. Do they realise that Northern Ireland is part of the United Kingdom? One wonders if customers with ancestors from Northern Ireland described themselves as from Ireland or the UK. It would have made more sense to ask people to define which country within the British Isles their ancestors came from rather than providing two confusing and overlapping options.

In view of these limitations it is not surprising that we often see some bizarre results. For example, it is often the case that Americans come out with much higher percentages of "British" ancestry with these tests than "native" Brits like me. Americans sometimes have surprisingly high "British" percentages of 80% or more.

The lack of defined reference samples from specific countries within the British Isles also sometimes gives confusing results. I have one project member with seven of his eight great-grandparents born in Wales and one great-grandparent born in Devon. At AncestryDNA he comes out as 64% Irish, 12% Great Britain, 12% Scandinavia and 11% Trace Regions. At Family Tree DNA his ancestry is reported as being 97% from the British Isles and 3% from Finland and Siberia.

There are no doubt problems with the sampling in other countries too, which produce similarly misleading results. Joss ar Gall, who writes the Le Gall of Lower Britanny blog, is French and all his ancestry is from Brittany yet, according to his 23andMe test, he is only 19% French. AncestryDNA assigned him no French DNA at all but found that he was 46% British and 10% Irish. One would expect many similarities between the French and the English, but clusters which clearly cross country borders should not be labelled so specifically because people are misled and take the labels too literally.

Admixture tests really need to be used for entertainment purposes only at the present time, and the results should be taken with a very large pinch of salt. However, the tests can sometimes provide useful insights. Generally it is possible to distinguish between populations at the continental level (eg Asian, African and European) provided you're from a population that is not close to a continental border. Admixture from endogamous populations such as Ashkenazi Jews and Finns can also be detected with reasonable confidence. However, it is not possible to distinguish between populations within individual European countries and it may never be possible to do so because our ancestry is so complicated.

Population level comparisons
While admixture results at the individual level are not particularly meaningful there is much more insight to be gained when the results of these tests are compared at the population level.

AncestryDNA have published a few very interesting blog posts with some nice maps comparing the admixture percentages of their British and Irish testers:

- What does our DNA tell us about being Irish by Mike Mulligan, Ancestry blog, 16 March 2015
- Exploring our DNA – Europe West by Mike Mulligan, Ancestry blog, 10 April 2015
- AncestryDNA - The Viking in the room by Mike Mulligan, Ancestry blog, 23 June 2015

AncestryDNA also did a similar exercise with their American testers and produced a genetic census of America with a range of maps showing the contribution of the different admixtures to the American population.

Ancestry also provide a useful bar chart which is hidden away in their help menu showing the differences between the various clusters. The chart below shows the differences between European regions. To access the chart click on the question mark in the top right of your screen from your ethnicity estimate page to open up the help and tips menu. Then click on "Why you might have more (or less) from a certain region". There is also a chart which will give you the breakdown for all 26 clusters.


A group of 23andMe scientists published a fascinating paper earlier this year in the American Journal of Human Genetics on The genetic ancestry of African Americans, Latinos, and European Americans across the United States (Bryc, Durand, Macpherson et al 2015).

The future
While these admixture tests will probably never give us all the answers we want, they will no doubt improve over time as better reference samples become available. We are already on the second incarnation of these tests at all three companies and we can expect to see many more improvements in the years to come. I would hope that all three companies will eventually be able to have access to the dataset from the People of the British Isles Project which should give improved estimates for people of British ancestry. It would also help if the reference samples were collected more carefully and with precise countries of origin clearly defined.

This article was updated on 17th May 2015 to include the screenshot of the bar chart from AncestryDNA showing the admixture breakdown within Europe. The article was updated on 18th May to include a mention of AncestryDNA's genetic census of America. The article was updated on 1st January 2017 to include a link to an AncestryDNA blog post on the percentages of Scandinavia DNA found in British and Irish testers.

FURTHER READING
My related blog posts
23andMe
AncestryDNA Ethnicity Estimate
FTDNA MyOrigins
© 2015 Debbie Kennett

Wednesday, 6 May 2015

New Y Elite 2.0 test from Full Genomes Corporation

The Full Genomes Corporation have announced the launch of a new Y-chromosome next generation sequencing product known as the Y Elite 2.0.

The technical details are as follows:

Length coverage: 13.2+ megabases (a conservative estimate)

Read length: 250 base pairs

Coverage: 30x

Supplier: Omega Bioservices 

Features: better SNP calling and better STR calling quality

Cost: US $750
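
As a rough illustration of how the length, read length and coverage figures relate to one another, the sketch below uses the standard relationship that average coverage equals the number of reads multiplied by the read length and divided by the length of the region covered. It assumes that every read is full length and maps within the covered region, and it counts individual reads rather than pairs. These are simplifications for the sake of the arithmetic and not figures supplied by the company; the function name is made up for this example.

```python
import math


def reads_needed(region_bp, read_length_bp, coverage):
    """Approximate number of reads needed for a given average coverage.

    Rearranges: coverage = reads * read_length / region_length.
    Assumes every read is full length and maps inside the region.
    """
    return math.ceil(region_bp * coverage / read_length_bp)


# Figures quoted above: 13.2 megabases, 250 base pair reads, 30x coverage.
reads = reads_needed(13_200_000, 250, 30)
print(f"{reads:,} reads")
```

With the same assumptions, reads of 100 base pairs would need two and a half times as many reads to reach the same average coverage.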

This new product replaces the earlier Y Elite test sourced through BGI Genomics and the Y Prime (100 bp) test. It is the only commercial next generation sequencing test which offers a 250 bp read length, and is also the only NGS test which includes mtDNA results.

For further details contact the company via their website:

https://www.fullgenomes.com

For information on the currently available SNP tests see the Y-DNA SNP testing chart in the ISOGG Wiki. The chart has not yet been updated to include the new Y Elite 2.0 test but should be updated in the next few days.

With thanks to Justin Loe of Full Genomes Corporation.