Monday, 21 May 2018

Updates to the Terms of Service and Privacy Policy at GEDmatch

In the light of the revelations that the citizen science website GEDmatch was being used by law enforcement to identify victims of crime and suspects in murder investigations, the site owners have now updated the Terms of Service and Privacy Policy. When you log into your GEDmatch account you will now receive the following message:
The new Terms of Service and Privacy Policy can be seen here. (You do not need to be a GEDmatch user to access the link.)

It is of note that GEDmatch have now clarified that "DNA obtained and authorized by law enforcement" can be uploaded for two very specific uses, namely to: "(1) identify a perpetrator of a violent crime against another individual; or (2) identify remains of a deceased individual".

There is also a new section which explicitly spells out the potential uses of your DNA results including the fact that your DNA could be used for "Familial searching by third parties such as law enforcement agencies to identify the perpetrator of a crime, or to identify remains."

The user is presented with three options: (1) to accept the new terms and conditions; (2) to reject the new policy and delete their account; (3) to decide later.

Preliminary thoughts
When the news first broke that GEDmatch had been used to identify a murder victim known as the Buckskin Girl, I expressed concerns about the use of GEDmatch for this purpose without the explicit informed consent of the users. The use of GEDmatch in the identification of a suspect in the Golden State Killer case exacerbated those concerns. I am therefore very pleased that GEDmatch have taken prompt action to update their site policy. The revised policy is also a commendable example of transparency, and a welcome use of plain, simple and direct language.

If you'd asked me two months ago what would happen if it was revealed that GEDmatch had been used by the police in a murder investigation I would have predicted that large numbers of people would have withdrawn their data and that GEDmatch would have been pressurised to restrict access to law enforcement. I couldn't have been more wrong. While views have been mixed there has been a positive reaction from many members of the genetic genealogy community who are happy that their DNA has potentially been used to catch a killer.

GEDmatch have made a bold and brave move in legitimising the use of their site for law enforcement searches in specific circumstances. I think they are being genuinely altruistic and want to have GEDmatch used to bring closure to the affected families. They are to be commended for this decision.

However, we now have a very interesting situation. GEDmatch is a citizen science website that was initially set up to provide DNA and genealogy tools to help adoptees who were searching for their biological parents. At present all the police DNA databases use autosomal STR (short tandem repeat) markers, and up to 24 such markers are currently used. Although the number of markers used is very small, they are specially chosen for their variability, and there are very low odds that two people would have an identical DNA fingerprint. Autosomal STRs can be used for familial searches but are only effective for identifying very close relationships. The standard tests used for genetic genealogy use a different type of marker known as a SNP (single nucleotide polymorphism). Upwards of 600,000 autosomal SNPs are tested on a microarray chip. Results can be compared in a database using the amount of DNA shared and the length of the shared segments to make predictions about relationships. Predictions can be made with reasonable confidence in combination with genealogical records for relationships up to about the second cousin level. Predictions are more difficult for more distant relationships because of the random nature of DNA inheritance and the limitations of family tree research. As far as I'm aware, there is no police force in the world which has its own autosomal SNP database for familial searches. Bizarrely, as Sarah Zhang has pointed out, GEDmatch has become "the de facto DNA and genealogy database for all of law enforcement". Given that probably around 80% of GEDmatch users are in the US, it is likely that, in the short term at least, it will only be the American police using GEDmatch in this way, though that situation could change as the consumer genetic testing market continues to grow internationally.
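
As a rough illustration of why predictions get harder beyond second cousins, here is a minimal Python sketch of the average fraction of autosomal DNA that full cousins would be expected to share. The function name is hypothetical and the figures are theoretical averages only. Actual sharing varies around these averages because of the random nature of inheritance, and real matching tools work from measured shared segments rather than from this simple formula.

```python
from fractions import Fraction


def expected_shared_fraction(cousin_degree: int) -> Fraction:
    """Theoretical average fraction of autosomal DNA shared by full n-th cousins.

    Hypothetical helper for illustration only; it is not how any
    matching site calculates its relationship predictions.
    """
    if cousin_degree < 1:
        raise ValueError("cousin_degree must be 1 or greater")
    # Full n-th cousins are linked through two shared ancestors (a couple).
    # Each path between the cousins passes through 2n + 2 generational steps,
    # and each step halves the expected sharing: 2 * (1/2)**(2n + 2).
    return Fraction(1, 2) ** (2 * cousin_degree + 1)


if __name__ == "__main__":
    for degree in range(1, 5):
        fraction = expected_shared_fraction(degree)
        print(f"Cousin degree {degree}: about {float(fraction):.3%} shared on average")
```

The expected amount falls by a factor of four with each further degree of cousinship, so the signal from a third or fourth cousin is already very small before random variation is taken into account.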

There are still many issues to be addressed going forwards. I have many questions but no answers:
  • Can privacy policies on websites be legally enforced?
  • What oversight will there be of these searches to ensure that the police use these powers responsibly?
  • Will there be an ethics committee or some other supervisory body which will scrutinise applications for such searches?  In the UK oversight is provided by the Biometrics Commissioner and Forensic Science Regulator. Do similar bodies exist in America and in other countries?
  • What measures will be taken to ensure that searches are proportionate and that large numbers of innocent third and fourth cousins identified from a crime scene sample are not brought into the police dragnet? In America this could mean that your cousin will be stalked by armed police in an attempt to get a discarded item to obtain a DNA sample.
  • Which genealogists are qualified to perform such searches and how can the police verify whether a genealogist has the necessary skills and will behave in an ethical way?
  • How will people feel if their DNA is used to falsely incriminate an innocent person? There have already been recorded cases of well-meaning volunteer search angels misidentifying the biological parents of adoptees. The stakes are much higher in criminal investigations, and especially in US states which still have the death penalty.
  • Is there a case for the police to upgrade their own databases so that they use autosomal SNPs instead of STRs?
All these issues will be discussed in the months and years to come, and it will be interesting to see what happens. For now I think it's important that everyone gets their voice heard. What do you think?


William Sokka said...

I agree that GEDmatch are now more straightforward and clear about the use by LE agencies. But I still see problems with GDPR, since the terms of service don't comply with GDPR. The service therefore risks a lawsuit and a fine from the EU (yet to be seen).

The purpose of GEDmatch is to be a *genealogical* database with a clear objective - finding your relatives, facilitating genealogical research, and being an open sharing community. "LE agency DNA database" is not stated as a purpose.

GDPR is about giving the user control over their data, in this case very sensitive data that may not be processed unless the user explicitly agrees to such processing. GDPR requires that (1) consent is given freely, (2) consent is given explicitly for each processing purpose, and (3) the user has the right to prohibit some processing.

(1) Giving consent freely means that access to the service cannot be made conditional on accepting all processing purposes. The user could be interested in the service, but not in receiving newsletters or promotions. According to the GDPR the user has the right to decline the newsletter and still use the service (the main point of interest for the user). The use of GEDmatch cannot be made conditional on accepting LE searches, since that is not the purpose of GEDmatch and the user has the right to opt out of such processing of their personal data.

(2) Explicit consent for each purpose.
Well, how exact should you be in getting ticks for this use and that use? It is not clearly defined, but there is a warning against casting the net too broadly, as GEDmatch's present policy does. But it is clear that there should be a consent form where a user gives explicit consent to
(i) uploading and processing the genetic data for the comparison with other users,
(ii) displaying necessary contact information, as the first name and e-mail,
(iii) displaying additional personal information, as surname, birth date and birth place to the genetic matches,
(iv) displaying uploaded GEDCOMs to the matches or making them fully public.
This list should include an item for explicit consent to the use of genetic data by LE agencies.
None of the items, except the first one, is necessary or should be a condition of using the (genealogical) service.

(3) Right to prohibit a kind of processing
According to GDPR the user has the right to prohibit some kinds of data processing if the processing is unlawful. If GEDmatch doesn't allow the user to opt out, then they will probably be sitting with some EU customers that have the right to use the service and can demand that their data is not processed by LE.

GEDmatch has taken a big step by updating their policy and being more transparent, but they still force you to accept all the terms of use, even if you only want to accept some of them. Still, it is the user who should have control of the data usage and it is not so, unfortunately.

William Sokka said...

Posting a comment again. You have many good questions; one could write a book or a research paper about each.

Can privacy policies on websites be legally enforced?

Yes, I think so.

I think we can learn from the community developing Open Source software and their different licenses - GPL, LGPL, ASL (Apache Software License), BSD license and so on. Not to forget Creative Commons license.

So anyone releasing their genomic data to the public can release it under certain conditions. In the early days of the GPL there was uncertainty about whether it would work, but it has held up in some cases, for example:

So the release of genomic data could be licensed - it is indeed the DNA tester who should have control over how this data is used. Let's imagine - if you are fully OK with releasing your genomic data publicly and approve all possible uses in the present and in the future, then release it under a "Fully Open Genome Data Public License". If you are OK with only genealogical research, then you can release it under a "Restricted Genealogical Research Genome Data License". If you want to give access to your genomic data to scientists and genealogists, release it under a "Limited ... License".

I think this is a possible way to enforce some privacy conditions. And I would like to see the genealogical community developing such licenses - a contract between a tester releasing the data and the public. The tester should have control over the data, not a web site, not an unknown LE agency. Everyone in our community should have an option to contribute and a choice of how to contribute.

Debbie Kennett said...

Thanks William. You raise some interesting points. There would need to be a dedicated website that would allow genetic genealogists to upload their data and control the licensing.

Doesn't already allow users to license their genomic data?

However, even with open licences there is still the potential for misuse. I contributed some data to a Wiki. Another genealogist plagiarised my content and passed it off as their own without crediting the source. These infringements are difficult to enforce and patrol.

William Sokka said...

Thanks Debbie! As I see offers only one possibility - you give away your genotype, and anything could be done with it. But it should be one of several possible alternatives.

Well, I understand your situation with the Wiki. Yes, it is hard to enforce and patrol for an individual.

Well, now I will sound like a freak, but... with such genomic licenses we probably don't need to be so proactive. Think of a case where the attorney of a suspect learns that a DNA search was performed on "The Very Restricted Genomic License" data. All the evidence is considered illegal, the case is closed and the suspect goes free. Such a possibility should push back LE agencies.

Let's dream that ISOGG will hire a lawyer and sue the state for the infringement of the license. Well, getting a fine in the range of 10 or 20 million euros is quite a deterrent...

Debbie Kennett said...


These are intriguing possibilities! ISOGG is a rather strange sort of organisation because it doesn't charge membership fees and therefore it doesn't have any money. The advantage of not having money is that you can't be sued or fined, but it also means that ISOGG would not be in a position to seek legal advice.

I suspect that in practice it would be difficult to license an individual genome but this could perhaps be done collectively on a collaborative website.

It would certainly be interesting if a foreign government were to be fined for breaching the privacy of EU citizens!

The two GEDmatch cases have yet to go to court so it will be interesting to see how the leads generated from GEDmatch searches will be interpreted and if the methodology will be considered valid and legal.

Martha Bowes said...

Bizarre is right! And I wonder if any perpetrators are learning about this and buying one-way tickets out of the country before changing their name?