Wednesday, 25 January 2017

"Making Sense of Forensic Genetics" - a new guide from Sense About Science

We find in genetic genealogy that there are many misconceptions about the use of DNA, and people often tend to give undue weight to DNA evidence. In order to draw a conclusion it is necessary to look at all the available evidence in combination, rather than a single piece of evidence in isolation. DNA evidence on its own is not very informative. It's also very important that the DNA evidence is interpreted correctly.

These principles are even more important in a crime investigation where a person's innocence or guilt is at stake. DNA evidence can be a game changer but there are also cases where its misuse has led to miscarriages of justice. Misconceptions are fuelled by the misrepresentation of DNA analysis in the media and in popular crime programmes such as Silent Witness, CSI and Waking the Dead. There are also some police departments which are using forensic DNA technology that has not been scientifically validated.

Making Sense of Forensic Genetics is a welcome and much-needed new guide, published by Sense About Science, which sets out to explain how DNA is used in the criminal justice system and to educate the public and professionals alike on the correct application of forensic genetics in criminal investigations. The guide also includes some helpful real-life case studies to illustrate how DNA evidence works in practice. As the authors say: "DNA needs to be viewed within a framework of other evidence. It’s an important detection tool, but it’s certainly not a detective".

The guide has been produced by EUROFORGEN, a European network of forensic DNA researchers, working in collaboration with Sense About Science. Many people from related disciplines were also involved in the development of the booklet including police, barristers, judges, legal charities and crime fiction writers. I was invited to a user feedback workshop as a representative of the genetic genealogy community and provided feedback on the drafts of the booklet so I can testify at first hand to the extensive consultation process involved.

The booklet clarified a lot of the questions that I had about the use of forensic genetics and the interpretation of DNA evidence in court. I highly recommend reading it. You can download a copy here.

Media coverage and further reading
Dr Denise Syndercombe Court, reader in forensic genetics at King's College London, and one of the researchers involved in the writing of the booklet, spoke about the guide on the BBC Radio 4 Today programme this morning. Listen here from 50:25.

The guide was featured on the BBC Radio 4 Inside Science programme with Adam Rutherford on 27th January.

A piece about the guide will shortly be up on The Conversation (I'll update this blog post when the link goes live).

Peter Gill, one of the authors of the guide, has written an article for The Justice Gap on How misuse of DNA evidence has led to miscarriages of justice.

Thursday, 19 January 2017

My Living DNA results Part 2: mtDNA and Y-DNA reports

In my previous post I wrote about my family ancestry maps from Living DNA which showed the regional breakdown of my genetic ancestry in Britain based on an analysis of my autosomal DNA. I'm now reviewing the mtDNA and Y-DNA reports, which have started to be rolled out to some of the early testers at Living DNA.

The mtDNA report is a provisional report based on my own Living DNA test on the Illumina Global Screening Array. Males who take the Living DNA test also receive a report on their Y-chromosome results. As I don't have a Y-chromosome, for the purposes of this blog post I've been given access to a sample report for someone who belongs to haplogroup R1b-U106 (my father's haplogroup).

mtDNA results
The Living DNA test analyses 4,700 mtDNA SNPs. My results show that I belong to haplogroup U4c1. I have also had my full mitochondrial DNA genome sequenced (all 16,569 base pairs) at Family Tree DNA. My full sequence results place me in haplogroup U4c1a. For my motherline ancestry I have received four different reports: a coverage map, a history page, a migration map and a phylogenetic tree.

Here is the coverage map which shows the present-day distribution of haplogroup U4, and the frequency of haplogroup U4 in different populations.


Here is the history page which provides background information on haplogroup U4.


Here is the phylogenetic tree which shows my placement on the mtDNA tree.


Here is the migration map which "shows the possible routes your ancient ancestors could have taken, from the point we all shared the same mtDNA (nicknamed “Eve”) to recent times".



Y-DNA results
The microarray chip used by Living DNA covers 22,500 Y-SNPs. The Y-SNPs are currently going through the quality control process, and it will be a few more weeks before the results are ready. Not all SNPs work properly on a microarray chip and it is likely that the actual number of SNPs reported will be reduced. The sample report I've been given is for haplogroup R1b-U106 (my father's haplogroup). U106 is quite high up on the Y-SNP tree but I understand that the test will give a more refined subclade assignment than this, though we don't yet know which SNPs are included on the chip. There are once again four different reports: a coverage map, a history page, a migration map and a phylogenetic tree.

Here is the U106 coverage map.


Here is the history page. I'm told that this is legacy information from the old test site, but that the content will eventually be updated and will be based on the scientific literature.


Here is a screenshot showing the upper branches of the Y-chromosome phylogenetic tree.


You can move the tree around and zoom in on the tree. Here is a close up showing the placement of U106 on the R1b tree.


Here is the migration map which "shows the possible routes your ancient ancestors could have taken, from the point all men shared the same YDNA (nicknamed "Adam") to recent times".


Conclusion
The mtDNA and Y-DNA pages are visually appealing and I like the simplicity of the presentation. The phylogenetic trees are easy to understand. The distribution maps and frequency tables are a very useful feature. It would be helpful to have the full citations with links to the actual papers, though I understand that these will be added in due course. At present the Y-DNA and mtDNA SNPs are not reported but these will also be added.

While it's good to see scientific papers used for the history pages it should be remembered that it's very difficult to provide meaningful information on haplogroup histories and migration. Much of the scientific literature on the subject is highly speculative with conclusions inappropriately drawn about ancient migrations and origins from modern DNA (Balloux 2009, Chikhi 2010, Goldstein and Chikhi 2002). The DNA of living people is not a good proxy for past populations, and direct evidence from ancient DNA is required (Pickrell and Reich 2014). However, we can expect to see many new ancient DNA publications in the coming years which will improve our understanding. I would hope that the Living DNA platform will have the ability to update the reports from time to time as and when new research is published.

The Y-DNA and mtDNA results from the Living DNA test are potentially useful for deep ancestry purposes but don't currently have a direct application for genealogical research. Y-DNA and mtDNA testing for genealogy needs to be done with a company such as Family Tree DNA which has a matching database that allows you to compare your results with other people. However, if you've already taken a Y-STR test at FTDNA and wish to refine your subclade assignment the Living DNA test could be a possible alternative to SNP testing or Y-chromosome sequencing. For SNP discovery and a detailed subclade classification it's necessary to take a Y-chromosome sequencing test (eg, the BigY from Family Tree DNA or the YElite from Full Genomes Corporation) but these tests are still relatively expensive and beyond the reach of the average genealogist.

I suspect genetic genealogists will be taking the Living DNA test primarily for the autosomal DNA family ancestry maps, but the Y-DNA and mtDNA information will be a useful bonus feature. Not everyone is interested in genealogical research and for people who just want an overview of their genetic ancestry then this is an excellent all-round test.

References

Balloux F (2009). The worm in the fruit of the mitochondrial DNA tree. Heredity 104: 419-420.

Chikhi L (2010). Update to Chikhi et al.'s "Clinal Variation in the Nuclear DNA of Europeans" (1998): Genetic Data and Storytelling - From Archaeogenetics to Astrologenetics? Human Biology 81(5/6): 639-643.

Goldstein DB, Chikhi L (2002). Human migrations and population structure: what we know and why it matters. Annual Review of Genomics and Human Genetics 3: 129-152.

Pickrell J, Reich D (2014). Towards a new history and geography of human genes informed by ancient DNA. Trends in Genetics 30(9): 377-389 (subscription required).

Wednesday, 11 January 2017

My Living DNA results Part 1: family ancestry maps

I've now received my long-awaited results from my Living DNA test. This test presents an overview of an individual's genetic ancestry across 80 world regions, including 21 regions from the British Isles. The methods behind the Living DNA biogeographical ancestry tools were used in the landmark People of the British Isles Project (POBI), and have since been applied to numerous populations around the world. It is the first test that allows people to see how their results compare with the POBI dataset. (See my article from September 2016 for background information about this test.)

I have received a complimentary test from Living DNA, and the results I am reporting here are based on testing done on the new Illumina Global Screening Array. In this article I will focus on the family ancestry maps. I will write about other aspects of the test, including the Y-DNA and mtDNA reports, in a separate blog post.

Only a handful of people have received their results so far, and I'm privileged to be one of the first people to receive results from this new test. Further results are slowly being rolled out but most people won't get their results until early or mid-February.

The Living DNA platform allows you to view results at three different levels: global, regional and sub-regional. The results are reported in three different modes:
  • Cautious - Similar populations are grouped together "in order to provide the highest certainty of results possible".
  • Standard - The company highlights "the sources of your ancestry estimate that they are most certain about. Ancestry that cannot be attributed to one of these sources is shown as being 'unassigned'."
  • Complete - This provides their "best estimate of your overall genetic makeup". 
Currently only the standard mode is available. The complete and cautious modes are due to be released in late February 2017.

The family ancestry maps show me the areas of the world where I share my genetic ancestry in recent times (4-5 generations). Update 17th February: The time frame has now been updated and my results page now states: "Your family ancestry map shows the areas of the world where you share genetic ancestry in recent times (10 generations)".

Here is my map at the global level which shows that 98.4% of my ancestry is from Europe with 1.6% unassigned.


By clicking on the + symbol I can access the submenu to view my genetic ancestry at the regional and subregional levels.

Here are my regional results in standard mode which place 98.4% of my ancestry in the British Isles with 1.6% unassigned. In my case the global and regional maps are identical because all my recent ancestry is from the same place.


Here are my subregional results in standard mode:

(I've adjusted the contrast on the above map to make the regions stand out more clearly.)

Here are my subregional results in standard mode in a tabular format:


Region                        Debbie's results
Lincolnshire                  17.5%
South Central England         17%
Devon                         16.2%
Southeast England             10.6%
Northumbria                   9.1%
East Anglia                   8.4%
Cornwall                      4.6%
South Wales Border            3.9%
Aberdeenshire                 3.3%
Orkney                        1.8%
North Yorkshire               1.7%
British Isles (unassigned)    4.3%
World (unassigned)            1.6%


I have also been given an advance preview of my subregional results in cautious mode. These results are subject to change:


Region                        Debbie's results
Southeast England             36%
South West England            20%
South Central England         19%
North Central England         14%
North East Scottish           3%
Unassigned British Isles      7%
Unassigned European           1%

This is a full list of the 21 regional groups within Britain and Ireland identified by Living DNA: Aberdeenshire, Central England, Cornwall, Cumbria, Devon, East Anglia, Ireland, Lincolnshire, North Wales, North Yorkshire, Northwest England, Northwest Scotland, Northumbria, Orkney, South Central England, Southeast England, South England, Southwest Scotland and Northern Ireland, South Wales Border, South Wales, South Yorkshire. Descriptions of the regions can be found here.

Comparing with known genealogical ancestry
So how do these results compare to my known genealogical ancestry? I've provided a chart below showing the place of birth of all my ancestors going back for five generations. (The chart is inspired by the #Mycolorfulancestry meme started by J. Paul Hawthorne over at the Geneaspy blog.)
As can be seen, the birthplaces of seven of my 32 great-great-great-grandparents are unknown. I have four great-great-great-grandparents who were born in London, and two who were possibly born in Bristol. People mostly migrated to Bristol from the West Country and Wales, but ancestors who lived in London could have come from anywhere in the country. According to one estimate "one in every six English people either visited or lived in London at some stage in their lives during the early modern period" and London was also the major destination for immigrants from continental Europe (Hey 2004). Identifying the origins of ancestors in big cities like London and Bristol is often difficult unless they happen to have survived until 1851 so that their birthplace is recorded in the census.

My genealogy is probably not ideal to be used for a comparison because I have so many unknown and London ancestors. However, I was very pleased to see that my Devon ancestry was reflected in my results. The average amount of DNA inherited from a great-great-grandparent is 6.25%. My Devon ancestry has come out at 16.2% which is higher than expected, but it could be that some of my unknown ancestors are from Devon. This high percentage might also reflect my Somerset ancestry. Somerset is not identified as a specific region with the Living DNA test.
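The arithmetic behind that 6.25% figure can be sketched in a few lines of Python. This is only a rough illustration and not part of the Living DNA methodology: the function name expected_share is made up for the example, and the calculation assumes the simple halving rule, whereas the amount actually inherited from any one ancestor varies because of recombination.

```python
def expected_share(generations_back: int) -> float:
    """Average fraction of autosomal DNA inherited from one ancestor
    who is the given number of generations back (simple halving rule)."""
    return 0.5 ** generations_back


# Great-great-grandparents are four generations back: (1/2)^4.
share = expected_share(4)
print(f"{share:.2%}")  # 6.25%

# The Devon estimate from my standard-mode results, expressed as the
# number of "average" great-great-grandparents it would correspond to.
devon_estimate = 0.162
equivalent_ancestors = devon_estimate / share
print(round(equivalent_ancestors, 1))  # 2.6
```

On this simplified view a 16.2% estimate is roughly what two to three great-great-grandparents would contribute on average, which is one way of seeing why the figure looks higher than expected.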

I was intrigued by the 1.7% from North Yorkshire. One of my London ancestors, my great-great-great-grandmother Elizabeth Horton, was born c.1806 in Hackney. I have some matches at 23andMe with a couple of people with the surname Horton in their family trees who have ancestry from Yorkshire. I've since discovered that Horton is a Yorkshire surname. This is a case where DNA testing has the potential to be used to inform genealogical research.

I don't have any documented ancestry from either Wales or Cornwall but these percentages could be explained by my Bristol ancestry. It's also important to remember that ancestry is not strictly defined by country or county borders. For example, during the nineteenth century many people from Devon moved to South Wales to work in the coal and copper mines so you could have people with four grandparents born in Wales who actually have a lot of ancestry from the South West. Conversely there are a lot of Welsh patronymic surnames that are found in North Devon.

I do not have any ancestry assigned to Ireland with the Living DNA test, whereas at AncestryDNA I came out with a surprisingly high percentage of Irish ancestry (20%). Living DNA are currently recruiting people with four grandparents born in Ireland to improve the Irish estimations so it may be that my results will be updated in the future and my Irish ancestry will be reflected in my results.

The big surprise is the high percentage (17.5%) assigned to Lincolnshire though I'm told that the region labelled as Lincolnshire also extends into North London. The Lincolnshire component could possibly be a reflection of my ancestry from Bedfordshire and Hertfordshire, two counties to the north of London that are en route to Lincolnshire. Alternatively it might perhaps be a clue to the origins of some of my unknown ancestors or it could represent deeper ancestry. It's very easy to speculate but with any DNA test it's important not to take the results too literally and to interpret the results in combination with the genealogical evidence.

The rest of my results broadly correlate with my known genealogical ancestry, though I have come out with more ancestry from the north of England than I expected.

Although I've only focused on my own results, Living DNA can also provide regional breakdowns for other countries, though not currently with such fine-scale resolution as for the British Isles. For example, in one set of results I've seen for a British individual with some Italian ancestry he was assigned ancestry from Southern Italy and Sardinia. Jewish and Aboriginal ancestry is not currently covered by Living DNA but should be added in the future. We can expect to see the results change over time as more studies are published and more reference populations are added to the database.

A comparison with my results from other companies
I also have biogeographical ancestry results from 23andMe, AncestryDNA, Family Tree DNA and GPS Origins.

23andMe aims to provide a representation of our genetic ancestry within the last 500 years. My speculative results at the subregional level match 56.1% of my ancestry with Britain and Ireland.

At Family Tree DNA the MyOrigins results from my Family Finder test match 57% of my DNA with the British Isles. FTDNA do not give a timeframe for their results.

At AncestryDNA 21% of my DNA is matched with Great Britain and 20% is matched with Ireland. AncestryDNA do not claim to show genetic ancestry within the last 500 years and state that the results can "reach back hundreds, maybe even a thousand years, to tell you things that aren't in historical records". The Irish component of the AncestryDNA test seems to be elevated in people of British ancestry, and the Irish cluster actually extends into Wales and Scotland. Most English people are coming out with at least 15% "Irish" at AncestryDNA. For further details see the AncestryDNA blog post What does our DNA tell us about being Irish?.

For a detailed comparison of my results from 23andMe, Family Tree DNA and AncestryDNA see my blog post Comparing admixture results from AncestryDNA, 23andMe and Family Tree DNA.

The GPS Origins test does not have any British or Irish components and instead matches 19% of my ancestry with Fennoscandinavia, 12.9% with Western Siberia, 12.4% with Sardinia, 11.8% with Orkney, 11.3% with Southern France and 11.2% with the Basque Country, with smaller percentages from other regions. For further details see my blog post A review of the GPS Origins tests: four ethnicities and four reports.

Conclusion
The Living DNA test has given me the most accurate results of all the biogeographical ancestry tests I've taken so far, placing 98.4% of my ancestry in the British Isles which is a true reflection of my documented ancestry within the last few hundred years.

For the first time I have been given a subregional breakdown of my results which shows the regions within Britain where my ancestors possibly lived in the last few hundred years. Because I have lots of ancestors who went through the melting pot of London and Bristol my British ancestry is very mixed as people migrated to these cities from across the country. This also makes research in big cities very challenging for the period before 1837, when civil registration began in England and Wales. I also have some gaps in my family tree because of illegitimacies. However, my Living DNA results correlate very well with what I have been able to document about my ancestry, and might even have provided me with some clues to break through some of my brick walls. I was particularly pleased to see that my Devon ancestry had been picked up and was intrigued to see the small traces of my possible Yorkshire ancestry.

My results only provide a sample size of one and it is very easy to interpret results post hoc and make them fit with your expectations so I shall be very interested to see results from other people, and especially people who have more defined ancestry than me from specific regions.

Biogeographical ancestry testing has entered an exciting new era and we now have a test that provides a fine-scale breakdown of our genetic ancestry at a subregional level.

Update
Living DNA updated their algorithms in June 2017 and released the family views feature providing cautious, standard and complete views. See my article My updated family ancestry maps from Living DNA.

Reference
Hey D (2004). Journeys in Family History. London: The National Archives, p.183.

© 2017 Debbie Kennett

AncestryDNA reaches the three million milestone


AncestryDNA have announced that they have reached the three million milestone. It took 11 months to go from 1 million to 2 million customers but just seven months to get to 3 million.

Astonishingly 1.4 million kits were sold in the last three months of 2016 with 560,000 kits sold over the so-called "Black Friday" weekend when the AncestryDNA test was on sale for just £49 in the UK and $49 in the US.