Sunday 3 November 2019

Genotype extraction and false relative attacks: potential security risks at third-party genetic genealogy sites

Hot on the heels of a paper published the other week by Michael "Doc" Edge and Graham Coop on the possibility of attacks on genetic privacy via uploads to genealogy databases comes another paper, by an independent team of researchers, warning of a further potential security risk.

The latest paper is written by Peter Ney, Luis Ceze and Tadayoshi Kohno, three researchers at the Paul G. Allen School of Computer Science & Engineering at the University of Washington. They caution about the risks of genotype theft and falsified genetic relations in the GEDmatch database.

I do not feel qualified to comment on the security risks they have identified so I will provide some links and let you make your own judgement.

The authors have published some FAQs, which are a good starting point:

If you want to read the full paper, you can find it here:

The possible implications are also discussed in this article by Antonio Regalado in MIT Technology Review:

See also this report in the University of Washington News:

GEDmatch were given advance notice of the publication of the paper to allow them time to implement any necessary fixes. I understand that GEDmatch currently have measures in place that would thwart the method described in this paper but, understandably, they are not sharing the specifics. Further measures are also being investigated.

Note that this loophole affects GEDmatch only. The method won't work at AncestryDNA, 23andMe, FamilyTreeDNA, MyHeritage and Living DNA.

Update 4th November 2019
This research was also covered in New Scientist. You need a subscription to access the full article but here are some quotes from the end of the article:
"The study identifies a “clear risk” to the GEDmatch database, according to Graham Coop at the University of California, Davis, who wasn’t involved in the work. “I do worry that [GEDmatch are] not taking these concerns seriously enough. They have over a million people’s genetic data and they have placed these data at risk, which is incredibly concerning.” 
The risks could be easily solved by limiting genetic data uploads to DNA test results that are authenticated or digitally signed, says Ney. Better checks on uploads to detect anomalies, and restrictions on one-to-one comparison searches would help too, he says. His team alerted GEDmatch to the vulnerabilities before publishing and took measures to avoid exposing anyone’s identity. 
Curtis Rogers at GEDmatch says: “We are concerned about security and appreciate they have pointed out vulnerabilities.” He says the site has made several changes to address the vulnerability and is working on others, but didn’t specify what measures."
The article can be found here:


dB said...

I think you are eminently qualified to comment on these security risks, Debbie. Even I understand how to go fishing for SNPs using a fabricated SNP kit, uploading it to GEDmatch, and harvesting the matches, which will therefore have the SNPs in the kit that was fabricated. Would that there were some way around this, other than having everyone set their kits as research only. I'm afraid there's not, which could spell the end of GEDmatch. Please keep us posted and thanks for your great Cruwys news!

dB said...

I think you are eminently qualified to comment on the security risks. Even I understand how to fish for people with particular SNPs by fabricating a kit with the SNPs of interest, uploading it to GEDmatch, and harvesting the matches, which by implication will contain the SNPs fished for. I simply don't see a way around this risk other than everyone setting their kit as research only, which would be the effective death of GEDmatch. Would be very interested in the GEDmatch admins' responses. In the meantime, thank you for keeping us informed through your great blog.

Debbie Kennett said...

Many thanks for your kind words. There's been another development tonight but it's getting late and I will write about it tomorrow.