Friday, 22 March 2013

AncestryDNA updates

I wrote previously of my experiences with the AncestryDNA autosomal DNA test. I covered the consent forms and the admixture analyses in my first article. In a second article I looked at the matching process. One of the big criticisms that I and many others had with regard to the AncestryDNA test was the fact that the company, unlike 23andMe and Family Tree DNA, did not allow customers access to their raw genetic data. I am now pleased to advise that Ancestry have listened to the feedback and have finally made the raw data accessible.

The data can be accessed by logging into your AncestryDNA account and clicking on "Manage Test Settings". Before downloading the data it is necessary to re-enter your password. There are notices to advise that the downloaded data is subject to the AncestryDNA Terms and Conditions and the AncestryDNA Privacy Statement. The Terms and Conditions were revised on 20th March 2013 and now include what appears to be a new section laying out the Rules of Conduct. These state among other things that "You must also agree that you will provide valid and complete contact information, and that you will always have a valid email address on file with AncestryDNA." In addition the rules include the following somewhat puzzling condition: "You must not use the information from the AncestryDNA website or DNA tests (including any downloaded raw DNA data) in whole, in part and/or in combination with any other database for any discriminatory, breach of privacy or otherwise illegal activity (for example, to re-identify any anonymous donor or to make insurance or employment decisions)."

The Rules of Conduct conclude with the following paragraph: "These Rules of Conduct are not exclusive. If we believe, in our sole discretion, that you are in breach of this Agreement, are acting inconsistently with the letter or spirit of this Agreement or otherwise interfering with the efficient management or delivery of the AncestryDNA Website, Service or Content, we may limit, suspend or terminate your access to our AncestryDNA Website. In such a case, no portion of your subscription payment will be refunded. Should we decide to suspend or terminate your access for any reason other than your actions or omissions which we believe to be inconsistent with this Agreement we will refund to you any unused portion of your payment, which will be your sole and exclusive remedy upon such a suspension."

I am not at all clear how someone can use their own genetic data in any type of illegal activity and it seems to me that it is entirely my business what I do with my own genetic data and has nothing to do with AncestryDNA at all. The requirement to maintain a valid e-mail address is of some concern as this rather suggests that any account that does not have a valid e-mail address will be excluded from the AncestryDNA database. Inevitably subscriptions will lapse over time. People become ill and are no longer able to continue their family history research or they die and their account is not passed on to a relative. Does this mean that all these results will be removed from AncestryDNA because the accounts no longer comply with the Rules of Conduct?

The AncestryDNA Privacy Statement has been similarly updated with effect from 20th March 2013. Interestingly I note that Ancestry have now signed up to the Safe Harbor program which relates to the "collection, use and retention of personal data from European Union member countries and Switzerland". Does this mean that Ancestry are gearing up to make their test available in Europe? In section 3 "How does AncestryDNA use your personal information?" there is what appears to be a new addition and by testing with Ancestry you are now giving them permission to use "your personal information" to "research human genetic diversity". From what I can gather this permission applies even if you have, like me, not signed the separate AncestryDNA Consent Form.

Having gone through the instructions on the AncestryDNA website you are sent an e-mail to confirm the download. The e-mail is reproduced below.

Having confirmed the data download you are taken back to the AncestryDNA website and taken to a page where you can download the raw data. I have provided a screenshot below.

The data is downloaded as a zip file and when the file is unzipped it opens up in Notepad. The many citizen scientists in the genetic genealogy world are currently trying to examine and make sense of the raw data. It is likely that third party websites such as Gedmatch will provide a facility to upload AncestryDNA data. Support will no doubt also be provided for the other third-party tools which are listed on the autosomal DNA tools page in the ISOGG Wiki.

Interestingly, although Ancestry do not provide information on the X-chromosome and Y-chromosome SNPs on their chip or use these results for matching purposes, the raw data is included in the download file so by downloading the data it will be possible to get more value out of the test. It is not yet known which Y-SNPs are included on the chip but this information could potentially be of great value for anyone who has taken a Y-STR test and who wishes to learn more about their deep ancestry by participating in a Y-DNA haplogroup project.

Another big announcement about the AncestryDNA testing service was made today at Rootstech by Tim Sullivan, Ancestry's President and Chief Executive Officer. He advised that the AncestryDNA test is now available at the new low price of $99 to both subscribers and non-subscribers. The test was originally offered at $99 in the beta-testing period. The price was subsequently raised to $199 for non-subscribers and $129 for subscribers. The latest reduction means that the AncestryDNA test is now the same price as the 23andMe test. However, the 23andMe test provides many additional features, including health and trait information, which are not available from Ancestry. Tim Sullivan also announced that Ancestry have over 120,000 autosomal results in their database. He promised that improved ethnicity results and improved cousin matches are on the way but no specifics were given.

Note that the AncestryDNA test is only available to US residents. Although I live in the UK, for some reason I was able to order the AncestryDNA test during the beta-testing phase, but I am one of only a tiny handful of non-US people in their database at present. It is not yet known when or if Ancestry will make their test available in other countries. For those of us who do not live in the US there is a straightforward choice between 23andMe and FTDNA's Family Finder test. Currently the most cost-effective way to get your results in both databases is to test with 23andMe and then transfer your results to FTDNA.

Family Tree DNA's Family Finder test is now much more expensive at $289 than the comparable offerings from 23andMe and AncestryDNA. The US is the prime market for all three companies. It will, therefore, be interesting to see how FTDNA respond to the competition. At the very least, it would be very useful if FTDNA could follow Ancestry's example and allow their customers access to their raw Y-SNP data. In theory FTDNA should be able to add AncestryDNA to their third-party transfer program, but the transfer currently costs $89, which is only $10 short of the cost of the 23andMe and AncestryDNA tests. Will FTDNA reduce the cost of the transfer to encourage more people to transfer their results and to widen their database? Whatever happens the competition will be very beneficial for the genetic genealogy community and we can no doubt look forward to many exciting developments in the next few years.

*Update 23rd March 2013*
I've now transferred my raw data from AncestryDNA into a spreadsheet. The file header contains the following information:
AncestryDNA raw data download
This file was generated by AncestryDNA at: 03/22/2013 10:39:55 MDT
Data was collected using AncestryDNA array version: V1.0
Data is formatted using AncestryDNA converter version: V1.0
Below is a text version of your DNA file from Ancestry.com DNA, LLC.  THIS INFORMATION IS FOR YOUR PERSONAL USE AND IS INTENDED FOR GENEALOGICAL RESEARCH ONLY. IT IS NOT INTENDED FOR MEDICAL OR HEALTH PURPOSES. THE EXPORTED DATA IS SUBJECT TO THE AncestryDNA TERMS AND CONDITIONS, BUT PLEASE BE AWARE THAT THE DOWNLOADED DATA WILL NO LONGER BE PROTECTED BY OUR SECURITY MEASURES. 
Genetic data is provided below as five TAB delimited columns. Each line corresponds to a SNP. Column one provides the SNP identifier (rsID where possible). Columns two and three contain the chromosome and basepair position of the SNP using human reference build 37.1 coordinates. Columns four and five contain the two alleles observed at this SNP (genotype). The genotype is reported on the forward (+) strand with respect to the human reference.
My AncestryDNA raw data file contains information on 701,478 SNPs divided into 25 chromosomes. I have data for:

- 17604 SNPs on chromosome 23
- 885 SNPs on chromosome 24
- 440 SNPs on chromosome 25

We of course only have 23 pairs of chromosomes. Ann Turner has clarified on the Genealogy DNA list that chromosome 23 is the X chromosome, chromosome 24 is the Y-chromosome, and chromosome 25 "is for XY SNPs, where the SNP is also found on the pseudo-autosomal regions (PAR) at the tips of the Y". As a female I do not have a Y-chromosome and most of my results for the Y are no calls (zeros). However, I do have results reported for 93 Y-SNPs. Apparently this is something to be expected for reasons which are not yet clear to me.
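For anyone who would rather not use a spreadsheet, the counts above could probably be reproduced with a few lines of Python. What follows is only a rough sketch based on the file header quoted above (five TAB delimited columns, with no calls shown as zeros). It assumes that the header lines start with "#" and that any column-title row can be recognised because its position field is not a number; neither assumption is confirmed by the header text itself. The function name `summarise` and the sample rows are made up for illustration and are not real results.

```python
import csv
import io
from collections import Counter

# Labels follow Ann Turner's explanation quoted above.
CHROMOSOME_LABELS = {"23": "X", "24": "Y", "25": "XY (pseudo-autosomal)"}


def summarise(lines):
    """Return (SNPs per chromosome, SNPs with a called genotype per chromosome)."""
    totals, called = Counter(), Counter()
    # Assumption: comment lines in the file header start with "#".
    data_lines = (line for line in lines if not line.startswith("#"))
    for row in csv.reader(data_lines, delimiter="\t"):
        # Skip blank lines and any column-title row (position is not a number).
        if len(row) != 5 or not row[2].isdigit():
            continue
        _snp_id, chromosome, _position, allele1, allele2 = row
        totals[chromosome] += 1
        # No calls are reported as zeros in both allele columns.
        if allele1 != "0" and allele2 != "0":
            called[chromosome] += 1
    return totals, called


# Made-up sample rows for illustration only.
sample = io.StringIO(
    "#AncestryDNA raw data download\n"
    "rsid\tchromosome\tposition\tallele1\tallele2\n"
    "rs0000001\t1\t1000\tA\tG\n"
    "rs0000002\t23\t2000\tC\tC\n"
    "rs0000003\t24\t3000\t0\t0\n"
    "rs0000004\t24\t4000\tT\tT\n"
)
totals, called = summarise(sample)
for chromosome in sorted(totals, key=lambda c: int(c) if c.isdigit() else 99):
    label = CHROMOSOME_LABELS.get(chromosome, chromosome)
    print(label, totals[chromosome], called[chromosome])
```

To try this on a real download, the unzipped text file could be opened with Python's built-in `open()` and the resulting file object passed to `summarise` in place of `sample`. Comparing the total and called counts for chromosome 24 should show how many Y-SNPs have a reported result rather than a no call.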

CeCe Moore was one of a number of genetic genealogists who had a meeting with the AncestryDNA people at Rootstech and she has advised on the Genealogy DNA list that Ancestry are working on a search function filtered by surname or user name. She further advises that Family Tree DNA are hoping to accept AncestryDNA uploads from the beginning of May and that Gedmatch will be able to accept AncestryDNA uploads in a couple of weeks.

*Update 24th March*
AncestryDNA have now added a section on raw data downloads to their FAQs which can be read here. Ancestry seem to be overly concerned about their customers misusing their data in some unforeseen way and provide a number of cautionary warnings about using your data on third-party websites.

*Update 25th March*
For a detailed report on AncestryDNA's plans see CeCe Moore's blog post on "AncestryDNA, Raw Data and Rootstech".

© 2013 Debbie Kennett

6 comments:

Brian Swann said...

As you said, Debbie, until Ancestry advertises this product in Europe I can only get so excited about it.

Angela Crouch was not at WDYTYA this year, so I start to wonder if she is still at Ancestry UK. She was their DNA person.

It still seems to me the POBI Project is streets ahead of anyone in terms of what folk in the UK really want, or am I missing something?

Debbie Kennett said...

I don't know what the staffing situation is at Ancestry. I can't get too excited about Ancestry either. An all-American database is not of much help to anyone here unless they are specifically trying to find cousins in America. I don't like some of their terms and conditions, and I don't like the idea of being locked into a subscription for full access. I'm looking forward to the next paper from the People of the British Isles Project but I still wonder how the POBI results can actually be translated into something meaningful as part of a DNA test. It might not be so easy to detect Devon or Cornish DNA at the individual level.

Jennifer K said...

I have a second cousin match at AncestryDNA, and we're desperately trying to figure out how we're related. Both of my parents (and I) have tested at 23andme and FamilyTreeDNA. I have raw data from both companies for all accounts.

Is there any way I can take the AncestryDNA raw data from my predicted second cousin to compare to my parents' raw data, aside from Gedmatch.com (their servers are down currently)? Anything I can do with Excel spreadsheets to compare and determine which of my parents are related to this second cousin?

Debbie Kennett said...

Hi Jennifer

There are various tools that you can use. There is a list on the ISOGG Wiki:

http://www.isogg.org/wiki/Autosomal_DNA_tools

As you've tested both your parents you will be able to phase your SNPs so that you can work out whether your matches are on your mum's or your dad's side of the family. The advanced users are doing something called chromosome mapping but this also involves testing other family members to identify where a segment originated. I've not yet started to use these techniques as I'm still waiting for meaningful matches on my side of the Atlantic! If you need any help I suggest you ask the experts on the ISOGG list:

http://groups.yahoo.com/group/DNA-NEWBIE/

You need to join ISOGG first but membership is free!

Robin Ward said...

Guys, many of us have made contact with non-US residents that are discovered as part of our own research. You, as a US resident, can order as many AncestryDNA kits as you have non-US family members and ship them to them as part of a "cross the pond" family study. It takes a bit of effort to organize this and you could end up in serious credit card debt if your non-US family don't pay you back, but it's doable. Results are administered by the US purchaser and shared with the test takers. Phone calls must be made to allow access to each other's results, but this is not that big a deal. Warning: not all DNA support staff are up on these "loopholes", and it will take 5-10 days for results to be made available to each other, but it is a way around the current restrictions. Now that raw DNA data is available to the "owner" you can find even more ways to develop a family group study. I know I can't be the only person who has been doing this as evidenced by the number of test results that are administered by someone other than the DNA donor.

Debbie Kennett said...

Robin, that's fine if you're an American wanting to investigate connections with known relations outside the US. However, the test is of no use whatsoever to non-Americans unless they are specifically looking for matches with close cousins in America. If Ancestry do ever decide to open up the test to people in other countries, the fact that the database has grown in such a lopsided way will be a severe disincentive for people to take the test. As one of the few Brits in the database I'm overwhelmed with matches with fifth to distant cousins in America that I can do nothing with. It would be like trying to find a needle in a haystack searching for the few meaningful matches with fellow Brits amongst this lot.