Sunday 19 July 2020

Major privacy breach at GEDmatch

There has been a major privacy breach at GEDmatch, the third-party genetic genealogy website which has become well known in the last two years because of its use by law enforcement agencies in the US to solve cold cases. A member of the Genetic Genealogy Ireland Facebook group posted a message at lunchtime today (13.38 UK time) to advise that the site had been compromised and that people were receiving what appeared to be fake matches with suspicious e-mail addresses. (This Facebook post has now been deleted.) Some users were reporting that they were receiving unusually large numbers of new matches, all sharing unexpectedly high amounts of DNA. In another group, one user reported receiving over 3000 matches, all of which shared over 700 cM. A match in this range would normally indicate a very close relationship such as a first cousin or closer.

Later this afternoon (14.54 UK time) a user posted in the Genetic Genealogy Tips and Techniques group on Facebook that all his kits on GEDmatch were now publicly accessible and all marked as available to the police. This included not just standard kits but also phased kits and Lazarus kits, which are by default always marked as research kits and are not normally available for matching. I checked my own account at GEDmatch and found that all my kits had been changed without my consent to allow police access. This included two phased research kits which were never intended to be made public. I initially found that I was unable to change the settings on any of the kits. The site was up and down for a short while this afternoon before I was finally able to log in and restore my preferred access settings.

Since then GEDmatch has been offline with a message that the site is down for maintenance.
Many other people have also reported that their kits have been affected and that the settings have been changed to allow police access without their consent. Graham Coop shared on Twitter this afternoon a screenshot of his accounts showing how they had all been changed to allow police access.

It therefore appears that the entire database has been changed to make all kits available for police access. This also means that the law enforcement kits, which are normally uploaded as research kits so that they do not appear in match lists, have been compromised. Anyone logging onto the website during this period would have seen those kits and might have been able to save a screenshot with the kit numbers. Allowing unauthorised access to law enforcement kits could potentially have serious consequences and could compromise an investigation.

This is clearly a matter of great concern. There are well over 1.2 million profiles on GEDmatch but only around 200,000 or so of them had been opted in to law enforcement matching. This means that the DNA profiles and e-mail addresses of probably around a million people have been exposed, in addition to all the law enforcement kits. It is unlikely that anyone would have been able to do anything with the matches during the period when the website was compromised because so many spurious matches were being produced. It is the exposure of the e-mail addresses and kit numbers which is likely to be of the most concern.

According to a report on the Tech in the City website, the original privacy settings were restored before the site was taken down, though I'm not clear what time this happened as I don't know which timezone the author is reporting from.

As GEDmatch operates in the European Union and has many EU customers, they are obliged to comply with the EU's General Data Protection Regulation (GDPR). Because of the serious nature of this breach it seems likely that they will have to report the matter to the appropriate regulatory authority in the EU. I don't know which authority they have registered with but the Information Commissioner's Office in the UK has information on how such data breaches should be reported. If a company or organisation has not protected the security of its customers' data then enforcement action can be taken and the company can be fined.

GEDmatch have since advised that they are aware of the issues and are responding. According to a post in the GEDmatch User Group on Facebook GEDmatch are "doing research right now to confirm what is happening. They are leaving the site down until they can clearly confirm what is going on." They are expected to make a formal statement later. It appears that this was an inadvertent update that went wrong. There appears to be no evidence that the site was hacked.

In the meantime it is pointless to speculate about what might have happened and we will need to wait until further information is available. I will update this page if I receive any further news.

Just after publishing this blog post I discovered (22.51 UK time) that GEDmatch is back up and running and my kits all have the correct access levels.

23.09: The following message has been posted on the GEDmatch Facebook page.

Update 21 July 2020
GEDmatch have announced on their Facebook page that they experienced a security breach on Sunday which was orchestrated through a sophisticated attack on one of their servers via an existing user account. The site was functioning briefly yesterday but reports started coming in late last night that people were once again receiving lots of unexpectedly high matches with a low SNP overlap in their match lists. I was able to briefly log into my account at 1.00 am and found that the kit I checked had lots of matches with users with words like "imputed" and "partial" in the names. My highest match was at the first cousin level with a user from the Chinese company Gese DNA. The site has now been taken down and GEDmatch are working with a cybersecurity company to implement new security measures. Here is a screenshot of the message from GEDmatch. I've removed the contact details from the post but these are available in the full version of the message in the Facebook group.

It is good that GEDmatch are being transparent about the problems and this may turn out for the best in the long run if the security of the database is improved. The site was down for at least three hours and although they say that no data was downloaded in that time it would have been possible to take screenshots of match lists from many different accounts. Once you have a kit number you then essentially have access to that individual's account. There is also a cascading effect because you can click on all the matches of the matches as well. This essentially means that all the kit numbers have been compromised because no one will know which kits were affected. All the kit numbers will need to be changed. It would be better if GEDmatch did not reveal kit numbers in the match lists. It will be interesting to see what happens but I rather suspect the site will be down for a long time.

Further update 21 July 2020
5.00 pm 
From the GEDmatch Facebook page: "GEDmatch will remain offline for 2 to 3 days as we further enhance security protocols. Thank you for your patience. We apologize for the inconvenience this has caused."

Update 22 July 2020
MyHeritage advised late last night of a security alert involving a malicious phishing attempt that was possibly related to the GEDmatch breach. For full details see the MyHeritage blog post:

The further reading section of this blog post has been updated to include an informative blog post from Leah Larkin explaining why we were seeing the mystery matches at GEDmatch sharing unusually high amounts of DNA. I have also included an official statement from Verogen which was published on their blog on 20th July, a further blog post from Leah Larkin which includes a timeline of the events, and an article from Peter Aldhous of BuzzFeed News.
An e-mail has been sent out by Verogen to all GEDmatch users informing them of the breach. My e-mail arrived at 8.40 am. It may take time for a bulk e-mail to reach all 1.2 million or more users. If you haven't received the e-mail check your spam folder. I've copied the text below in case you haven't received it.

Dear GEDmatch member,

On the morning of July 19, GEDmatch experienced a security breach orchestrated through a sophisticated attack on one of our servers via an existing user account. We became aware of the situation a short time later and immediately took the site down. As a result of this breach, all user permissions were reset, making all profiles visible to all users. This was the case for approximately 3 hours. During this time, users who did not opt-in for law enforcement matching were available for law enforcement matching, and, conversely, all law enforcement profiles were made visible to GEDmatch users.

On Monday, July 20, as we continued to investigate the incident and work on a permanent solution to safeguard against threats of this nature, we discovered that the site was still vulnerable and made the decision to take the site down until such time that we can be absolutely sure that user data is protected against potential attacks. It was later confirmed that GEDmatch was the target of a second breach in which all user permissions were set to opt-out of law enforcement matching.

We can assure you that your DNA information was not compromised, as GEDmatch does not store raw DNA files on the site. When you upload your data, the information is encoded, and the raw file deleted. This is one of the ways we protect our users’ most sensitive information.

Further, we are working with a leading cybersecurity firm to conduct a comprehensive forensic review and help us implement the best possible security measures. We expect the site will be up within the next day or two.

We have reported the unauthorized access to the appropriate authorities and continue to work toward identifying the individuals responsible for this criminal act.

Today, we were informed that MyHeritage customers who are also GEDmatch users were the target of a phishing scam. Please remember to exercise caution when opening emails and clicking links. Never provide sensitive information via email. If an email seems suspicious, contact the company in question directly through the phone number or email address listed on their website, not via a reply to the suspicious email. You can reach GEDmatch at  xxxx or xxxxx [email address and telephone number removed]. At this time, we have no evidence to suggest the phishing scam is a result of the GEDmatch security breach this week. We are continuing to investigate the incident.

Please be assured that we take these matters very seriously. Our Number 1 responsibility is to protect the data of our users. We know we have not lived up to this responsibility this week, and we are working hard to regain your trust. We apologize for the concern and frustration this situation has caused.


Brett Williams
CEO, Verogen Inc.

For a French translation of this e-mail see the post in the Facebook group France ADN - Généalogie Génétique (ISOGG).

Update 25th July 2020
There is a notice on the GEDmatch Facebook page suggesting that the site will be back online today, though at 11.35 am UK time the site was still down.

The site was restored in the afternoon of 25th July and no further issues have been reported to date.
