Wednesday 12 December 2012

Genographic results from the UK

The first results from Geno 2.0, the new DNA test from the Genographic Project, are now starting to appear. A genetic genealogy friend in the UK has very kindly agreed to share screenshots of his results with me for publication on this blog. One of his parents is English and the other is from the Philippines, so he has some very interesting results. Each participant is given a very cool infographic summarising their results which they can share with their friends.
These are the pages which tell the personal genetic story of the participant.

The two reference populations with which this participant most closely matches are Vietnam and Romania. These seem rather odd selections and don't match his documented ancestry from England and the Philippines, but perhaps there are insufficient reference populations in the database to give accurate matches. 
This close up provides details of the British reference population used by the Genographic Project.
A fun part of the test is that you are told your percentages of Neanderthal and Denisovan ancestry.
For the Y-DNA results you get a nice map showing the migratory path of the different branches of the Y-DNA tree. This is the map showing the path of U106, one of the major branches of the R1b tree.
We can then follow the journey of U198, one of the subclades of U106.
This rather nice heat map shows the distribution of U198, which appears to be found almost exclusively in the British Isles and north-western France. It would be helpful to have the references that were used to compile the map. Perhaps that information will be added later.
For the mitochondrial DNA there is a description of the haplogroup, which in this case is haplogroup F, reflecting the participant's maternal ancestry from the Philippines.
There is a map showing the migratory path of haplogroup F. 
There is also a heat map showing the places where haplogroup F is mostly found, though again it would be useful to have a list of the sources used.
Genographic results can be transferred free of charge to the Family Tree DNA database. CeCe Moore has blogged about her own results and has also included detailed instructions on the process of transferring results to FTDNA. We will no doubt learn much more as people test and contribute their results to research. Genographic results will be updated on a regular basis as more results are received and more reference populations are added to the database. For further information on the Genographic Project visit the Genographic website.


Tuesday 11 December 2012

23andMe test now down to $99

The personal genomics company 23andMe has just announced that it has reduced the cost of its test from $299 to $99. The new price has been made possible following the company's announcement today that it has raised more than $50 million of funding with the aim of helping it to achieve its growth goal of one million customers. The full press release can be read here.

Note that postage for the 23andMe kits in the US costs just $9.95 but is significantly more expensive in other countries because the kits are sent by courier rather than by post, although a prepaid return service is included in the cost. In some countries there are additional customs charges. Shipping to the UK costs $79.95. It costs $59.95 to send the kits to Canada, and $74.95 for Australia and New Zealand. I have not checked all the prices but I noticed that 23andMe charge $94.95 to ship to Cyprus, Malta and Iceland and $118.95 to send kits to Bosnia and Belarus. For a list of countries that 23andMe will ship to, see this FAQ on their website.
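
To get a rough idea of the total outlay, the $99 kit price can simply be added to the shipping charges quoted above. The short Python sketch below does this. The names KIT_PRICE, SHIPPING and total_cost are my own, chosen for illustration, and the figures are only those quoted in this post. Customs charges are not included.

```python
# Prices in US dollars as quoted in this post (December 2012).
KIT_PRICE = 99.00

# Courier charges quoted in the post; customs charges are NOT included.
SHIPPING = {
    "US": 9.95,
    "Canada": 59.95,
    "Australia": 74.95,
    "New Zealand": 74.95,
    "UK": 79.95,
    "Cyprus": 94.95,
    "Malta": 94.95,
    "Iceland": 94.95,
    "Bosnia": 118.95,
    "Belarus": 118.95,
}


def total_cost(country):
    """Return the kit price plus shipping for one kit, rounded to cents."""
    return round(KIT_PRICE + SHIPPING[country], 2)


if __name__ == "__main__":
    # List destinations from cheapest to most expensive shipping.
    for country in sorted(SHIPPING, key=SHIPPING.get):
        print(f"{country:12s} ${total_cost(country):.2f}")
```

For a customer in the UK this works out at $178.95, so the shipping charge comes close to doubling the headline price.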

If you test with 23andMe you can also transfer your results to Family Tree DNA's Family Finder database for genealogical matches. Note, however, that the Y-DNA and mtDNA results from 23andMe are not included in the transfer as these results are not compatible with FTDNA's genealogical matching database. Although the 23andMe test includes a Relative Finder feature, many of the people who test with 23andMe do so for health reasons and aren't interested in researching their family tree. Family Tree DNA also has a much more international database than 23andMe, largely thanks to its association with the Genographic Project. FTDNA will in theory send kits to any country in the world and charges a flat rate of just $6 for international postage. For information on the process of transferring kits to FTDNA please read the FAQs on Third-Party Transfers.

For further information on the 23andMe test read my four-part feature on "Exploring my genome with 23andMe":

Part 1 Disease risks
Part 2 Carrier status and drug responses
Part 3 Traits
Part 4 Ancestry

See also my blog post on 23andMe's new Ancestry Composition - a British perspective.

Saturday 8 December 2012

23andMe's new Ancestry Composition - a British perspective

23andMe's new Ancestry Painting feature, now known as Ancestry Composition, has just been launched. The old Ancestry Painting was only able to distinguish between three continental population groupings - European, Asian and African. I was a very boring and predictable 100% European.

Ancestry Composition provides a biogeographical analysis based on 22 reference populations. 23andMe have provided an excellent guide to the science behind Ancestry Composition which is well worth reading in order to get an understanding of how the analysis works. Ancestry Composition provides a number of different views showing your comparisons with global, regional and subregional populations at three different confidence thresholds - speculative (50%), standard (75%), and conservative (90%).

My documented ancestry is all from the British Isles. I know the names and birthplaces of 15 of my 16 great-great-grandparents and they are all English. In this generation I have one illegitimate line which has prevented me from finding out the name of the remaining ancestor. The birthplaces of these 15 great-great-grandparents are: Burrington, Devon; Bristol (2); Thornbury, Gloucestershire; Clapham, London; Colchester, Essex; Sandon, Hertfordshire; Limehouse, London; Bermondsey, London; Merriott, Somerset; Sydenham, Kent; Sydmonton, Hampshire; Kintbury, Berkshire; Westminster, London; Sherston, Wiltshire.

I know the names of 27 of my 32 great-great-great-grandparents, but I only know the birth places of 21 of these ancestors. All of my known ancestors are from the British Isles. These are the birth places where known: Ashreigney, Devon; Mariansleigh, Devon; Thornbury, Gloucestershire; Bristol; Great Yeldham, Essex; Preston, Hertfordshire; Sandon, Hertfordshire; Scotland (place not known); Hackney, London; Laverstoke, Hampshire; County Kerry, Ireland; Merriott, Somerset; Rickmansworth, Hertfordshire; Shoreditch, London; Ecchinswell, Hampshire; Welford, Berkshire; Kintbury, Berkshire; Salford, Bedfordshire; Holborn, London; Leighterton, Gloucestershire; Purton, Wiltshire.

Ancestry Composition gives me the following percentages:

Sub-regional Resolution
Standard Estimate
17.4% British and Irish
1.6% French and German
74.2% Nonspecific Northern European
0.1% Sardinian
0.2% Nonspecific Southern European
6.5% Nonspecific European
0.1% Unassigned

Conservative Estimate 
0.3% British and Irish
71.1% Nonspecific Northern European
0.1% Nonspecific Southern European
28.0% Nonspecific European
0.5% Unassigned

Speculative Estimate
56.7% British and Irish
10.7% French and German
0.1% Scandinavian
31.2% Nonspecific Northern European
0.3% Sardinian
0.5% Nonspecific Southern European
0.4% Nonspecific European

The Sardinian and Southern European percentages are undoubtedly false positives. It is not clear if the French and German admixture appears because of the difficulties in distinguishing between British, French and German populations or if this is a reflection of more distant admixture from the Normans and Saxons.

This screenshot shows the much improved Ancestry Composition with a view of my Speculative Estimate.

These are my percentages for the Regional and Global Resolutions:

Regional Resolution
Standard Estimate
93.2% Northern European
0.2% Southern European
6.5% Nonspecific European
0.1% Unassigned

Conservative Estimate
71.4% Northern European
0.1% Southern European
28.0% Nonspecific European
0.5% Unassigned

Speculative Estimate
98.8% Northern European
0.9% Southern European
0.4% Nonspecific European

Global Resolution
Conservative Estimate
99.5% European
0.5% Unassigned

Standard Estimate
99.9% European
0.1% Unassigned

Speculative Estimate
100% European
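
The regional figures appear to be the sub-regional figures added together. The Python sketch below checks this for my Standard Estimate. The grouping of the sub-regional categories under each region is my own assumption based on the category names, and the names SUBREGIONAL_STANDARD and roll_up are made up for this illustration.

```python
# Sub-regional Standard Estimate percentages from this post, grouped by
# region. The grouping is an assumption based on the category names.
SUBREGIONAL_STANDARD = {
    "Northern European": {
        "British and Irish": 17.4,
        "French and German": 1.6,
        "Nonspecific Northern European": 74.2,
    },
    "Southern European": {
        "Sardinian": 0.1,
        "Nonspecific Southern European": 0.2,
    },
    "Nonspecific European": {"Nonspecific European": 6.5},
    "Unassigned": {"Unassigned": 0.1},
}


def roll_up(subregional):
    """Sum the sub-regional percentages within each region (one decimal)."""
    return {
        region: round(sum(parts.values()), 1)
        for region, parts in subregional.items()
    }


if __name__ == "__main__":
    for region, pct in roll_up(SUBREGIONAL_STANDARD).items():
        print(f"{pct:5.1f}% {region}")
```

The Northern European total comes out at 93.2%, matching the regional figure. The Southern European categories add up to 0.3% against a reported 0.2%. I presume this small difference arises because the displayed percentages are rounded to one decimal place, but that is only a guess.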

Although the subregional representations do not assign me as much British ancestry as might be expected, it is worth bearing in mind that these analyses are still in their infancy. 23andMe explain in their Ancestry Composition guide that their reference populations are largely drawn from their customer base and are supplemented from public reference datasets such as the Human Genome Diversity Project, HapMap, and the 1000 Genomes project.1 However, only a small number of genomes are as yet available in the public datasets. The 23andMe customers who are included in the reference dataset are required to have four grandparents born in the same non-colonial country. Although 23andMe were reported to have 180,000 paying customers in their database as of 9th October 2012, their customers are mostly Americans of mixed ancestry, few of whom will meet the qualifying criteria.2 Not all of the 23andMe customers will in any case have filled out the ancestry questionnaire. With the combination of 23andMe customers and public datasets there are just 7,868 people in the reference dataset used for Ancestry Composition. As all four of my grandparents were born in the UK I presume my own results have been included in this reference dataset. I think it is a shame that 23andMe's questionnaire does not split up the United Kingdom into the four constituent countries as it would be more interesting to see if differences could be found between England, Scotland, Wales and Northern Ireland, rather than lumping all four very different countries together.

23andMe very helpfully provide details of the reference populations that they have used in their analysis. Below are screenshots showing the figures for the reference populations which appear in my Speculative Estimate.





As can be seen, the numbers are very small, but 23andMe have designed the Ancestry Composition tool in such a way that the results can be updated on a regular basis as and when more populations are added to the reference databases, so no doubt the accuracy of the predictions will improve over time. Those of us from the British Isles can probably expect to see big improvements when the datasets from the People of the British Isles Project become available. This project has tested over 4,500 people from the UK.3 To be eligible for the project people must have not just four grandparents from the same country but four grandparents from the same rural county. It might, therefore, one day be possible to assign percentages of DNA to specific English counties or regions.

"British/Irish" DNA seems to have been a particular problem with Ancestry Composition. Although the tool has a very high accuracy rate for the DNA which is assigned as British and Irish in the validation tests (a "precision" level of 0.90), it is much less successful at identifying all British/Irish DNA as British/Irish. The technical term for this is the recall rate. The recall rate for British and Irish DNA in the 23andMe validation tests was 0.32 (32%), meaning that 68% of British and Irish DNA will not be picked up.1 The recall rate will no doubt improve as more reference samples are added to the database. However, it is difficult to quantify British or Irish DNA because we are an admixed population, comprising a mixture of DNA from many different groups such as the Saxons, Celts, Vikings, Picts, Normans, Bretons and Romans.

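The difference between precision and recall is easier to see with a worked example. In the Python sketch below the counts are entirely hypothetical. I have chosen them only so that they reproduce the published rates of 0.90 and 0.32, and they are not taken from 23andMe's validation data.

```python
def precision(true_pos, false_pos):
    """Of everything labelled British/Irish, the fraction that really is."""
    return true_pos / (true_pos + false_pos)


def recall(true_pos, false_neg):
    """Of everything that really is British/Irish, the fraction labelled so."""
    return true_pos / (true_pos + false_neg)


# Hypothetical counts, chosen only to reproduce the published rates.
TRULY_BRITISH_IRISH = 1000                   # segments that really are British/Irish
true_pos = 320                               # correctly labelled British/Irish
false_neg = TRULY_BRITISH_IRISH - true_pos   # 680 missed by the tool
false_pos = 36                               # other DNA wrongly labelled British/Irish

if __name__ == "__main__":
    print(f"precision = {precision(true_pos, false_pos):.2f}")
    print(f"recall    = {recall(true_pos, false_neg):.2f}")
```

In other words, when the tool does label something as British and Irish it is usually right, but about two thirds of the DNA that really is British and Irish ends up in one of the broader categories instead.
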
Chromosome view
As well as the map view there are two alternative views: split view and chromosome view. To use the split view it is necessary to have one parent in the 23andMe database. As my parents have not tested with 23andMe I cannot make use of this feature. I can, however, access the chromosome view which provides an interesting breakdown of the various percentages on the individual chromosomes. The screenshot below shows my Speculative Estimate.

With so many similar shades of blue it's quite difficult to distinguish the individual populations that make up each chromosome, though you can hover over a specific population to get a clearer picture. The screenshot below picks out the chromosomes where 23andMe speculates that I match with French and German populations.

As can be seen, whole chromosomes seem to have been matched with French and German populations, which I don't quite understand. I don't have any French or German ancestry within the last several hundred years. At the population level all British people would be expected to share many markers in common with the French and the Germans, but after several hundred years have passed I would have thought that there would only be tiny segments of "French" and "German" DNA scattered throughout my genome.

Neanderthal DNA
In addition to Ancestry Composition another interesting and fun feature of the 23andMe test is that it will give you your percentage of Neanderthal admixture. This feature was introduced in December 2011.4 23andMe estimate that 2.5% of my DNA is inherited from Neanderthals.

Neanderthal percentages are also provided by the new Geno 2.0 test from the Genographic Project, which also provides percentages of Denisovan DNA. I imagine that 23andMe will eventually update their test to provide Denisovan percentages.

Conclusion
Ancestry Composition is a great improvement on 23andMe's Ancestry Painting. The percentages seem to be much more accurate than those provided by AncestryDNA. 23andMe also has the advantage of providing technical information on the methodology used by its scientists and valuable details of the reference populations used for the analysis, features which are notably absent at AncestryDNA. Family Tree DNA's Family Finder test includes a tool known as Population Finder. An update to Population Finder is expected in the New Year and it will be interesting to see how this compares with Ancestry Composition.

Other blog posts on Ancestry Composition
A number of other bloggers have written about their experiences with Ancestry Composition or provided commentary. Here is a list of the posts I have found to date. I will update the list as and when new posts are discovered:
- 23andMe's new Ancestry Painting - first look! by CeCe Moore. This post includes screenshots showing statistics on all the reference populations used by the Ancestry Composition tool.
- 23andMe Ancestry Composition Examples Part 1 by Andrea Badger. This post includes a magnificent selection of screenshots from people with a variety of mixed heritage producing a wonderful rainbow of colours.
- New worldview at 23andMe by Roberta Estes.
- My Ancestry Composition from 23andMe by Aidan Byrne.
- 23andMe Ancestry Composition by Dienekes Pontikos.
- Admixture advances by Judy Russell.
- 23andMe adds ancestry composition by John Reid.
- Is Daniel MacArthur 'desi' by Razib Khan.

References
1. Ancestry Composition: 23andMe's State-of-the-Art Geographic Ancestry Analysis. Anonymous article on the 23andMe website. Accessed 8th December 2012.
2. How many paying customers does 23andMe have? Answer provided on Quora.com website by 23andMe software developer Alex Kohmenko on 9th October 2012.
3. The website of the People of the British Isles Project keeps track of the collection progress. As of 8th December 2012 it was reported that 4,538 samples had been collected.
4. Find your inner Neanderthal. 23andMe blog post by Scott H, 15th December 2011.

See also
My four part feature on "Exploring my genome with 23andMe":
Part 1 Disease risks
Part 2 Carrier status and drug responses
Part 3 Traits
Part 4 Ancestry

© 2012 Debbie Kennett