Tuesday, 14 July 2020

Some updates to AncestryDNA's matching system and a database update


Ancestry announced at a conference call today that there are some changes in the pipeline in terms of how our matches are reported. There will be three main changes:

1) Ancestry will provide a more accurate report on the number of segments shared with your matches. The updated matching algorithm may reduce the estimated number of segments you share with some of your DNA matches. However, it won't change the estimated total amount of shared DNA (measured in centimorgans/cM) or the predicted relationship to your matches.

2) Ancestry will report the length of the largest shared segment. This is particularly important for people who are descended from endogamous populations. Knowing the length of the longest segment you and a DNA match have in common can help determine if you’re actually related. The longer the segment, the more likely you’re related. Segment length is also the easiest way to evaluate the difference between multiple matches that all show the same estimated relationship.

3) The matches will be re-calibrated to remove false matches so that the reported matches are more likely to be related through a recent common ancestor. Once the update is implemented, only matches which share 8 cM or more will be reported. Ancestry estimate that this will remove about two thirds of the false matches. All matches that fall below the new threshold will disappear from your match list, with the exception of matches you have messaged, matches where you've added a note and matches you have added to a group by using the system of coloured dots. Starred matches will also be retained as they are considered part of a group. If you save a match below 8 cM, the match will also be saved automatically on your match's account without any action on their part. Any matches sharing less than 8 cM in total will no longer appear as common ancestor hints or in the ThruLines feature, and this change may affect the number of ThruLines you see. If you want to save these matches you'll need to make sure you add them to one of your groups or add a note. Note that the threshold applies only to the total cM shared after the application of the Timber algorithm, so you could still have matches which share some individual segments that are smaller than 8 cM so long as the sum total of all the segments is 8 cM or more.
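The retention rule in point 3 can be written out as a short Python sketch. This is only an illustration of the rule as described here, not Ancestry's code: the `Match` class, its field names and the sample matches are all hypothetical.

```python
from dataclasses import dataclass

THRESHOLD_CM = 8.0  # new minimum total shared cM, as described in point 3


@dataclass
class Match:
    """Hypothetical record for one DNA match (not an Ancestry data structure)."""
    name: str
    total_cm: float         # total shared cM after the Timber algorithm
    messaged: bool = False  # you have messaged this match
    has_note: bool = False  # you have added a note to this match
    in_group: bool = False  # coloured dot or star


def is_retained(match: Match) -> bool:
    """Return True if the match would stay on the list under the rule above."""
    if match.total_cm >= THRESHOLD_CM:
        return True
    # Below the threshold, a match survives only if it has been saved somehow.
    return match.messaged or match.has_note or match.in_group


# Made-up sample matches for illustration only.
matches = [
    Match("A", 7.9),
    Match("B", 6.2, has_note=True),
    Match("C", 8.0),
    Match("D", 7.4, in_group=True),
    Match("E", 6.0),
]
kept = [m.name for m in matches if is_retained(m)]
print(kept)  # ['B', 'C', 'D']
```

Matches A and E are dropped because they fall below 8 cM and have not been messaged, noted or grouped.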

On-site messaging will start to appear in the next few days to alert users to the updated matching system (this messaging is now live), and a new matching white paper will be available later this week. (The white paper has now been published and can be accessed here.) We can expect to see the new matching system rolled out in early August.

The increase in the match threshold will mean that many matches will disappear from our match lists. However, in practice, this is not going to have any effect on our genealogical research as these small matches have proved to be so unreliable that they are impossible to work with. The last time I analysed my matches at AncestryDNA and compared them with my parents' match lists I found that 54% of my matches in the 6-7 cM range did not match either of my parents and were therefore probably false positives. (1) Clearly if there is over a 50% chance that a match will be false we cannot reliably assign these matches to a common ancestor, even if we can identify one in our shared family trees. Even if the match is real, the chances are still very low that it will be a reflection of a recent genealogical relationship and it is far more likely to be the result of very distant sharing. (2)
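For anyone who wants to repeat this kind of parent and child comparison, the calculation can be sketched in a few lines of Python. This is a minimal sketch under my own assumptions: each match list is a dictionary from a match identifier to the total shared cM, and the function name and sample figures are made up for illustration. It is not the method used for the 2017 analysis.

```python
def share_not_matching_parents(child, mother, father, low=6.0, high=8.0):
    """Fraction of the child's matches in [low, high) cM found on neither parent's list.

    Each argument is a hypothetical dict mapping a match ID to total shared cM.
    """
    in_range = {match_id for match_id, cm in child.items() if low <= cm < high}
    if not in_range:
        return 0.0
    # A match seen on neither parent's list is a probable false positive.
    unmatched = in_range - set(mother) - set(father)
    return len(unmatched) / len(in_range)


# Made-up example data: four of the child's matches fall in the 6-7 cM range.
child = {"m1": 6.3, "m2": 7.1, "m3": 6.8, "m4": 7.9, "m5": 25.0}
mother = {"m1": 9.0, "m5": 40.0}
father = {"m3": 7.2}

rate = share_not_matching_parents(child, mother, father)
print(f"{rate:.0%}")  # 50%
```

Here m2 and m4 appear on neither parent's list, so half of the small matches in this toy example are probable false positives.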

I currently have over 32,000 matches at AncestryDNA, which is far more than I can ever possibly cope with. However, if you really are desperate to go through your matches and check the 6 and 7 cM matches before they disappear, you can use the filter under Shared DNA to set a custom cM range to identify these matches.

In other news AncestryDNA's corporate page has been updated to show that they have now tested 18 million people. AncestryDNA now have by far the largest genetic genealogy database in the world. 23andMe is the next largest with a database of 12 million people. MyHeritage have 4 million people in their database, while FamilyTreeDNA have tested over two million people. (3)

The lockdown seems to have encouraged a renewed interest in family history so we can also look forward to receiving many more matches in the months and years to come.

Update 4th August
The update has been delayed and will now be rolled out in stages. You will find full details, including FAQs, when you log into your AncestryDNA account.



Ancestry is now displaying decimal points for all matches sharing under 10 cM. All matches sharing under 8 cM will be removed at the end of August. This includes matches in the 7.5 to 7.9 cM range which were previously rounded up to 8 cM.
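The effect of the rounding can be shown with a small Python sketch. The exact rounding rule Ancestry used for display is an assumption on my part (ordinary round-half-up to the nearest whole cM), and the function names are hypothetical.

```python
import math

THRESHOLD_CM = 8.0


def displayed_whole_cm(cm: float) -> int:
    """Old-style display: round half up to a whole number (assumed rule)."""
    return math.floor(cm + 0.5)


def survives(cm: float) -> bool:
    """The threshold is applied to the unrounded total."""
    return cm >= THRESHOLD_CM


results = [(cm, displayed_whole_cm(cm), survives(cm)) for cm in (7.4, 7.5, 7.9, 8.0)]
for cm, shown, kept in results:
    print(f"{cm} cM was shown as {shown} cM; retained: {kept}")
```

Under this assumption a 7.5 cM or 7.9 cM match used to be displayed as 8 cM, but it still falls below the threshold and will be removed.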

Further reading
Footnotes
1. See my blog post Comparing parent and child matches at AncestryDNA from August 2017 for the full details of this analysis.
2. See the ISOGG Wiki page on identity by descent which includes a chart from a 2015 paper by Doug Speed and David Balding providing the distribution of different-sized segments by generation.
3. FamilyTreeDNA do not publish details of the size of their autosomal DNA database. The two million figure for the number of people tested is taken from the FAQs on their home page. In the section headed "Who is FamilyTreeDNA?" they say: "Over 2 million people have tested with FamilyTreeDNA, resulting in the most comprehensive DNA matching database in the industry." FTDNA used to publish daily updates on the number of Y-DNA and mtDNA records in the database on their "Why choose FamilyTreeDNA?" page. However, the figures on this page have not been updated since July 2019. Martin McDowell did an analysis in February 2020 based on FTDNA kit numbers in which he estimated that FTDNA's autosomal DNA database was approaching two million. See the blog post "How big is the FamilyTreeDNA database" on the Genetic Genealogy Ireland website.

Updates
This page was updated on 15 July 2020 to include a third footnote to clarify information about the size of the FamilyTreeDNA database. It was updated on 16 July to include a link to the updated AncestryDNA white paper and a further reading list. It was also updated to clarify that starred matches will not be retained. The page was updated on 17 July to include links to blog posts from Blaine Bettinger and Leah Larkin. The page was updated on 19 July following the receipt of an e-mail from AncestryDNA which clarified that starred matches would be retained after all and that any matches you save will also be automatically saved on your match's account. Additional information from Ancestry about the changes in the reporting of segments was added to numbered points 1 and 2. A link to Judy Russell's blog post was added on 28 July.

66 comments:

Robin in Short Pump said...

I am alarmed at the thought of losing my 7cM matches. I have so many that ARE valid. So, is colored-dot grouping the only way to keep them from disappearing? What if I've got them linked to someone in my tree?

Debbie Kennett said...

The matches will also be retained if you've messaged the person or added a note to the match. The vast majority of these matches are not valid. Being able to identify a common ancestor with a match does not make it valid.

Ed said...

I have about 150,000 matches for Ancestry DNA. I will be glad that some false ones will be gone. I wish they would show X-matches and have a chromosome browser.

alexander993 said...

You state: "matches you have added to a group by using the system of coloured dots" - are the "starred matches" covered by the phrase coloured dots too?
Christine.

Gene Prescott said...

I assume the 2 million at FTDNA are their atDNA tests (Family Finder) and don't include their YDNA tests. Do you know current total for their YDNA tests?

Debbie Kennett said...

Gene,

FTDNA have a page on the website which specifies the number of Y-DNA and mtDNA tests but it's not been updated for over a year. In July 2019 they had 749,555 Y-DNA records in their database:

https://www.familytreedna.com/why-ftdna

The 2 million figure is taken from the FTDNA home page in the section "Who is FamilyTreeDNA?" and it refers to the number of people tested but doesn't provide a breakdown by test.

Debbie Kennett said...

Ed

That's an astonishingly large number of matches. You certainly don't need to worry about the 6 and 7 cM matches. I'd like to see all the companies make better use of the X-chromosome. While a chromosome browser would be useful it's not essential and it can lead to people misinterpreting their results. Having more accurate data on the number of segments is going to be a big improvement.

Christine

I presume that starred matches will be saved but I don't know for sure. I would hope that Ancestry will provide further information in the near future to clarify some of these issues.

Unknown said...

Not all those small matches are false and I see no reason why not to continue showing them. People decide whether to follow through or not. What I would most like to see is the showing of shared matches regardless of match size. The amount under 20cms I have and have no idea where they belong is a constant frustration to me.

Unknown said...

I have found a pattern of matches 'THAT ARE NOT' identified in Ancestry as it currently presents. Currently I used a co-ordinated approach to seek to correlate and then cement my matches. Confirming each of these matches takes an inordinate amount of time, let alone rely of the sharing of DNA to the various sites to get access to the ability to correlate potential links. I don't believe I have the time/resources to 'catch' any or all of the potential people that I will eventually match. From what you have posted, I don't even know how I could 'protect/even identify' the connections that I want to maintain in the timeframe that is now in operation (and reported here).

Each of the various 'providers' uses different algorithms to generate their matches. So I wondered if there is any (known) way of understanding/appreciating how these different approaches interact with the DNA (provided) to generate 'matches'?

If the differences/similarities could be identified, then one could more efficiently work out how to work to organise the various clues to confirm matches.

A recent example was a 9 cM 'match' on GEDmatch that showed up as a nil match on Ancestry. However, knowing the match, and having some of the background (and using the tree built on Ancestry) I was able to work out the paternal match.

In another example, a maternal match for my wife showed no matches to one person. However by consistent research and following a series of shared matches found that my wife and the (unmatched on Ancestry person) should match at somewhere between 2-4C level.

Will there be any way of being able to revisit the 'old system' once they make the coming change operational?
Thanks Jim

Unknown said...

I too am upset that I will be losing my 6 & 7cM matches that have common ancestors or ThruLines suggestions. Just because the DNA match is small or doesn't even match at all, doesn't mean they are not a relative. I have a 3rd cousin with whom I do not have a DNA match (and we're told that about 10% of 3C will not share enough DNA to match) but I do share DNA with some of his cousins, so he is definitely related, and I have learnt so much by communicating and collaborating with him. Now admittedly I did not find him through DNA, obviously, but the principle is the same. I now have communication with some 6 & 7cM matches that I found using ThruLines, and whether the DNA match is valid or not, the relationship is. I'm currently marking all of those matches, but sad that I won't ever get any more.

Chris Greene said...

I did my DNA test on Living DNA, I wonder when Ancestry will allow me to upload these results rather than have to pay for their DNA test?

pbear12 said...

I have a lot of common matches at 7cms. I am very disappointed in ancestry.

Justme said...

That boggles my mind that Ancestry is larger than 23andme, yet 23andme can more correctly identify my DNA (verified from records over hundreds of years)

M.Bolam said...

Bad news for European AncestryDNA users. A typical European user only has 5,000-10,000 DNA matches in total.
So you work with small matches too.
If Ancestry is going to grow in the European DNA market, they had better leave the small matches in the list. The Timber algorithm already deletes enough matches. You can see this when you compare two Ancestry tests one-to-one at GEDmatch. If 50% of the small matches under 8 cM are wrong, then 50% are still true matches. There is no need to remove the true matches. If Ancestry wants to help users, it should give them a chromosome browser to check how many SNPs the small cM matches have.

Unknown said...

Concerned that they are filtering out below 8 cM. I have many matches that are legit below this level which are linked to my tree. You are removing a choice from the serious researcher. The problem is that inexperienced researchers see these as nuisance matches as they don't have the experience to make their own decisions. A negative action in my opinion. Bill Marsh New Zealand.

Unknown said...

Is there a way I can tell if my matches are either maternal or paternal?

Annie said...

Is there a way to bulk add the small matches to a group? I manage 11 kits and can't possibly save them all one by one.

I find the smaller matches invaluable. Here are a couple of examples. I have about 4 separate matches to people descended from 6 x grandparents of my dad. I was sceptical because of the distance but these people transferred their DNA to Gedmatch and MyHeritage and I'm building a picture of how chromosome 2 is filling up with descendants of this ancestral couple. Without the small matches, I couldn't do this.

Second example. ThruLines is finally becoming useful to me as closer cousins are testing. My mum has a 17 cM match to someone but, until recently my aunt, nor another known cousin matched them. The other day a new match appeared showing a 6cM connection to the same person which has given me confidence in the initial hypothesis. I wouldn't have had this under the new system.

HELLY117 said...

FTDNA are currently running 3 further tests on my late father's sample from 2007, worth knowing they keep these for 25 years!!!

Pie Wummin said...

I have several smaller matches who are actually more closely related than the DNA indicates. One in particular is a great grand-daughter of my maternal grandmother's half brother. She shares over 100cMs with my maternal aunt (mum's sister) but shares only 7cMs with me. She is my half 2nd cousin once removed and a half first cousin twice removed of my aunt. She was one of the first people to contact me through Ancestry because she and her mother shared so much DNA with my aunt. I do have her saved to that particular family group, but I think it proves that beyond 2nd cousin the amount of DNA shared becomes so much more unpredictable. It sometimes happens that you find a third cousin with very little or no DNA in common, but they have relatives who share a good deal more with you and you also know who the common ancestors are. I am again speaking from personal experience. I have a third cousin (well documented and supported by paper trail genealogy and DNA) who shares no DNA with me, other than a very small segment on Gedmatch, but shares a large amount of DNA with my first, second and third cousins. So it is problematic and I am a bit concerned about losing matches with very small segments in common that might actually, in reality, be much more closely related than the DNA specifies.

Unknown said...

Very disappointed, Ancestry.com. Perhaps sorting out the “false” 50% matches makes more sense.

Unknown said...

It does not prove it to be valid, but it increases the likelihood of it being valid considerably, while there is no evidence whatsoever for the broad claim that "the vast majority of these matches are not valid". In my opinion, it is a lazy claim by people who do not research hard enough, and have no idea how to tackle large numbers of matches. Removing 6-7cM matches will remove 25 percent of my clusterable matches. I am from Europe and I rely on every single match that has shared matches at all. Often, these are the only ones who have sizeable trees attached, so they are the ones who provide clues to what a cluster of matches stands for. Ancestry's decision to remove these matches is a lazy cop-out for their load problems, simply because they are not willing to invest in a better cloud infrastructure. It has nothing to do with making life easier for customers. But it does make research harder for those of us who are capable of bringing order into the overwhelming number of matches. Personally, I could identify over 250 matches with my own clustering algorithm. It led to 96 ThruLines matches, only one of which was found to be wrong. Ancestry's decision is condescending at least, and downright hostile to those doing serious research. And regarding the load problems, I have told Ancestry over and over again that they should implement a thumbnail cache for their stupid infinite scrolling match list. By the time you have managed to scroll to the end of your list, you will have downloaded several gigabytes of data, 99.8 percent of which are full-sized images, sometimes in the 12 megapixel range. It is so incredibly stupid, it just boggles the mind. Ancestry should tackle this problem instead of trying to solve it from the wrong end by withholding 30-40% of the useful information. They must surely hate their customers.

Slrphoto59 said...

I have about dropping the smaller numbers also.as my last message to a family member was very valid. I would a valuable resource.

Slrphoto59 said...

This is so valid of a point. I would have also lost several family members had this happened sooner. Some are as close as 3rd cousins. This is very disappointing.

tekni2 said...

So if a match with 6-7 cM has shared matches with people at 35 cM and a green leaf these are still being deleted? I have gotten many clues from matches like these.

girlclifford said...

I have called and asked Ancestry.com to extend the time to 31 December 2020 to give us, their customers, time to mark the ones that are between 6 & 8 cMs (I also used the filter Common Ancestors to lower the number of matches to be marked). The matches can only be marked individually not collectively so giving only till 31 August in the summertime is not a great deal of time. Perhaps if enough people call and affirm this request, we could be granted an extension. I agree, for those of us with large trees who do the paperwork/document trail, this is a huge hit....maybe not so much for the weekend warriors who just do it for fun.

Fax said...

This is upsetting to me. While I never use small matches in isolation, in congregate with larger matches and the additional trees that sometimes accompany them, I have solved several mysteries. Timber alone has been a stumbling block by ignoring matches to testees in other company databases. This will be a further degradation of Ancestry's services.

Carolynn said...

Black folks NEED those distant matches. This is white supremacist policy. It does not take the needs of descendants of the enslaved, who have small paper trails, at all.

Retired Coder said...

First, I doubt that this decision has much to do with confused, overwhelmed newbies. It's about money. The larger the database, the more 6-7cM matches everyone is going to have. The more customers Ancestry has, the more searches they'll have where they have to return these small matches. The additional load costs money, whether Ancestry is doing it themselves or using Amazon Web Services or another cloud service. Dropping 6-7cM matches protects the investment of the equity funds that own Ancestry.

Second, I'll grant that thousands of my 6-7cM matches aren't valid or worth pursuing. I'll even allow that someone that shares a common ancestor with me may not be a true DNA match at the 6-7cM level. BUT that distantly related cousin with an identified common ancestor is EXACTLY the type of person I'm looking for. I've got many fifth great-grandparents that I know very little about, and I don't care how I find descendants-- through message boards, Facebook, serendipity, or even through an "invalid" DNA match. I'm not collecting "true DNA matches," I'm doing genealogy. That distant cousin has value, even if their DNA connection is iffy.

I can understand Ancestry's motivation, but they're throwing out dozens and dozens of babies with the bath water.

Harry Angus said...

So Ancestry will give a more accurate number of the segments our matches share with us? I do not see how knowing how many segments someone shares with me would be useful if I don't know which chromosome number or the placement of that segment on the chromosome. Since Ancestry does not have a chromosome browser, this would be useless.

I also heard they will eliminate matches of 7cM or less to reduce false positives. Eliminating false positives do not help at all because they will eliminate a lot of matches with big trees that sometimes are the best to get you ancestors from a long time ago. Those potential false positives are very useful because based on the last names they share with us, we can make our own determination if they are false positives or not.

Ancestry is all hot air. They are charging so much and lack what all other 3 companies give for free: a chromosome browser. I have not seen a new match in 3 months, so I am guessing they are not getting a lot of new business. I can only hope they do something that makes sense to their customers paying them a monthly subscription, and not receiving the basic services offered by their competitors.

If 23&Me add a tree to their site, Ancestry will start losing a lot of their paying customers. They may continue to get new people buying kits at $75 who will never spend a penny more on the site, but they will lose those who are paying months after months.

Bonnie said...

Ancestry site is particularly slow today, probably because we are all scrambling to Save Our 7s!

pbear12 said...

I manage 19 kits and there is no way I am going to be able to save all of the connections I have found in the 6cms -7cms matches.

Tina said...

Carolynn, please don’t make this into “white supremacy”. My ancestors were relatively recent immigrants and I very much need The small matches also.

Daniel said...

I see pros and cons to this maneuver. The Rule of Thumb here is segments lower than 7 cMs can produce many false-positives. Ancestry uses the minimum 6 cMs and has given many users tens of thousands of matches. With the largest database of atDNA in the market, it would make sense to be a little more precise. Thankfully they aren't taking the 23andMe precision into account and stopping at 20 cMs.
I'll be labeling all my 6 cM and 7 cM matches that also show a Common Ancestor for a later date to review. If a common ancestor is listed, I'd at least like to have the opportunity to investigate and see if, in fact, we are related. so it's color cluster and repeat for them.

Philip Cowen said...

Now all ancestry needs to do is add a chromosome browser. It really helps validate matches to what appears to be a common ancestor.

Unknown said...

This is really BAD NEWS!!! If you have a lower cm match that has a well developed tree, it can go back to the late 1700s and early 1800s. If they have shared matches with you, they can introduce you to new family surnames that you never knew about. The number of cms and segment size diminishes naturally over the generations—larger segments from Grandparents will be considerably smaller in length today due to this loss—but those segment bits go back to purer/truer family line from the old countries. Like the old Sicilian families that actually lived in Sicily, before people left in droves to come to America, where they married non-Italians, and the Italian DNA fades out after several generations. SOOOOO SAD!

sgrehome said...

My guess is never - they don't allow uploads from any other companies. Btw, I too am a Greene.

Unknown said...

This is an incredibly bad idea for people like me who are project managers of a surname. I have over 200 Ancestry kits shared to my account from known descendants of the surname project I manage, Marrinan. This is an incredibly unusual surname from a very confined area of western Co. Clare. I am aware of more than 500 Ancestry DNA kits of known Marrinan descendants. I maintain the genealogical and DNA databases for the surname. I routinely recommend Ancestry DNA to people because it is, by far, the easiest to work with. I know from Y DNA testing that all Marrinans descend from one common ancestor, so a match as low as 6 cMs, for me, is a valid match that I can work with. I don't want to be a manager of those DNA accounts, which would allow me to use your tools in this article to preserve the match. I just want to be a viewer of the kit so I can monitor it for new matches for people with that surname in their ancestral origins. To lose a 6 cM match in this project is going to be catastrophic. Please rethink this or come up with a way to allow me to preserve the matches in kits shared to my account.

Unknown said...

I am disappointed that you are taking this position as I have found true matches at 6 and 7 and when comparing to some direct family members we can see that chr data to be the same even when we don't meet the new threshold of 8 cMs. I believe you are being short sighted as dna matches have to be worked hard to prove the opposite that you are indeed related. Beside when you do the test the results are really random as you are not identical in matches even with your siblings and there has been a number of times that I haven't had a match with people whom I know to be second cousins but I am more than willing to dig deeper to see how I may compare with my brother for example who shows up as a larger match to those same people. The system needs more flexibility not less.as dna is not yet the perfect science, it is moving in that direction but not there yet.

georgiajohnston said...

Email is georgiajohnston@comcast.net if you have questions.

AmiableAngel said...

Of the 15 matches I have with identifiable shared ancestry via ThruLines, 4 of them are 6-7 cM. This is a huge loss for African American researchers, who I contend may have a lot of half-relationships from the time of slavery, and takes away a lot of the value of ThruLines.

Linda said...


All you people:
Check out Diahan Southard's blog re this subject and her "how to" video. She demonstrates a much easier and faster way to color code matches.

https://www.yourdnaguide.com/ydgblog/2020/7/15/ancestrydna-shared-matches-updates
Linda J

Debbie Kennett said...

This post has attracted an unusually high number of comments and I will do my best to respond here. I will split my replies over several posts.

I’ve now updated the blog post with a link to the updated Matching White Paper. I’ve also updated the post to clarify that starred matches will not be saved.

In answer to Jim’s question about the match thresholds, there is a page in the ISOGG Wiki which has information on this subject though it’s currently missing data on MyHeritage:

https://isogg.org/wiki/Autosomal_DNA_match_thresholds

I also recommend checking out this YouTube presentation from Blaine Bettinger which explains why the cM total can vary so much from one company to another:

https://youtu.be/Veb6vn_BFRY

AncestryDNA have the most advanced and reliable system for detecting matches whereas GEDmatch has a very primitive system. If you have a match at GEDmatch but not at AncestryDNA then the match at GEDmatch is probably not valid.

It’s important to remember that finding a genealogical connection with a small-segment match does not provide any evidence that the match is valid. One day we will be able to determine the validity of small matches but for that to happen we will probably need whole genome sequencing or chips with many more SNPs than are available on the chips currently used by the different testing companies.

My understanding is that there will not be an option to revert to the old system at AncestryDNA.

It is always helpful when a DNA match, even if it is a false match, leads you to find new genealogical connections and helps you to advance your research but you don’t need the matches in order to search the family trees on Ancestry. If you’re restricting your searches only to people on your DNA match list you’re missing out on lots of potentially useful connections.

In answer to Chris Greene’s question, AncestryDNA does not accept uploads so if you want to be in the AncestryDNA database you’ll need to take a new test. You can, however, upload your Living DNA raw data to MyHeritage DNA. FamilyTreeDNA accept uploads but unfortunately they don’t currently allow uploads from Living DNA.

Debbie Kennett said...

In response to M Bolam’s comments about European users can I suggest you transfer your AncestryDNA data to MyHeritage DNA to give you more matches:

https://faq.myheritage.com/en/article/how-can-i-upload-a-dna-file-to-myheritage

MyHeritage have a website available in many different languages and have been able to attract lots of new users in continental Europe, particularly in non-English-speaking countries. It's also worth transferring to FamilyTreeDNA if you've not already done so as they also have a good user base in many European countries.

The problem with the small matches is that even if they are real matches they are far more likely to date back 10 or 20 generations and so they are simply not relevant for confirming genealogical connections.

In response to Bill Marsh, being able to find a genealogical connection to a 6 cM or 7 cM does not make it any more real. Experienced researchers recognise the dangers of trying to use low matches. It’s much better to focus on your top matches in the fourth cousin and closer category.

In response to the anonymous poster, if you have tested your parents then Ancestry will divide the matches into the maternal and paternal sides but they will only do this for the fourth cousin and closer matches. If you can’t test your parents it helps to test other family members such as aunts, uncles, first cousins and second cousins. You will then be able to use the Shared Matching Tool to work out which line the match is on.

That’s all for tonight! I will respond to the remaining comments tomorrow.

Untilthattime said...

Not happy with this at all. I have plenty of 7 cm matches that have proven shared ancestors.

Unknown said...

Anyone who says that these 6-7 cM matches are not meaningful is not using AncestryDNA to its fullest. While a random 6-7 cM segment may have a 50% chance of being false, it also has a 50% chance of being real. When one of these 6-7 cM matches is in a cluster of matches or also matches another known relative, it increases chances greatly that the match is legitimate. There are many small matches that have greatly helped in my research and taking them away will hurt Ancestry users' ability to find information about their ancestors. I'm sure this has everything to do with computer processing/storage, which in the end means money, and it's very unfortunate Ancestry is taking this away under the guise of doing us a service by removing bad matches.

Retired Coder said...

Debbie Kennett said, "It is always helpful when a DNA match, even if it is a false match, leads you to find new genealogical connections and helps you to advance your research but you don’t need the matches in order to search the family trees on Ancestry. If you’re restricting your searches only to people on your DNA match list you’re missing out on lots of potentially useful connections."

No, you don't need matches to search family trees on Ancestry. BUT with ThruLines/Common Ancestor hints, Ancestry will connect say Mary's tree with Elizabeth's tree, which connects to stubby trees from Bill and Harry, who only go back to their parents or grandparents in their trees. Bill and Harry match me at 6 or 7cMs. Yeah, I could find Bill and Harry in a tree search if I know all the dozens and dozens of 2nd or 3rd great-grandchildren of a 5th great-grandparent, but frankly, I don't. I doubt that anyone here does. I'm not going to find Bill and Harry searching trees, I'm not going to know of them because of the unreported DNA match, so now I can't ask them what they know of the family line. (Before someone argues that they're not going to know anything since they only go back a couple generations in their trees: yeah, probably. But maybe they just got tired of entering their tree, or maybe they could refer you to a relative that knows more, or maybe they have an old Bible in the cedar chest, etc. You don't know if you don't ask. And you can't ask when you don't know they're there.)

Glynne said...

I have my ancestryDNA uploaded to myheritage as well so occasionally a new match comes through on that. One thing I do like on myheritage is that where they show SHARED matches they also show how many cM are shared from each side. So if A shares 20cM with B, but C shares 100cM with B, then it is obviously worthwhile communicating with C (especially if B has no tree).
Can we persuade Ancestry to do a similar comparison?
Glynne

Unknown said...

I have read the whitepaper now, and it says the new minimum segment length is 8cM, whereas it was 6cM before. That, my friends, means that we will not only lose matches below 8cM. We will lose a lot of the higher matches as well. For example, a confirmed 4th cousin 1x removed matches me with 18cM in 3 segments, so by AncestryDNA's previous white paper, all three segments must have had the minimum length of 6cM. These are all no longer considered by the new filtering algorithm - the match will disappear.

Robin in Short Pump said...

I looked at the whitepaper, which is way over my head, but I don't see what you're seeing. Can you point to the section/paragraph in the whitepaper that specifically says we would lose these higher matches under the condition you describe?

I just hope you are taking a too-literal interpretation of something. Because how would we ever know that an 18cM match has any segments below the new minimum of 8 since they aren't giving us that information yet?

Debbie Kennett said...

I've now updated my blog post to include a link to an excellent article from Blaine Bettinger about the changes at AncestryDNA which I highly recommend reading.

https://thegeneticgenealogist.com/2020/07/17/losing-distant-matches-at-ancestrydna

The notice about the updated matches is now showing on my AncestryDNA account and there is a link to some FAQs which are also well worth reading.

Debbie Kennett said...

Annie

I’m afraid I don’t know of any way to add small matches to a group.

The fact that so many small matches are either not valid or very distant means that they cannot be used to test hypotheses about relationships. The fact that you share small matches with known relatives does not provide any additional evidence for the relationship.

AncestryDNA has by far the most reliable and scientific matching system. The fact that a match is showing up at GEDmatch and MyHeritage but not at Ancestry should throw up red flags. It is also highly suspicious that all these presumed matches are on chromosome 2. We would expect matches to be evenly distributed across all our chromosomes and not all piling up in the same position on the same chromosome. It sounds to me as though Ancestry’s Timber algorithm is doing what it should do and is downweighting what are known as pile-up regions which are not likely to have any genealogical relevance.

https://cruwys.blogspot.com/2018/01/small-segments-and-pile-ups.html

Unknown said...

Robin,

the information about segments below 8cM being discarded is on page 12 in the whitepaper, step no. 5 in their workflow. This threshold was previously at 6cM.

So, any current match with an average segment size of less than 8cM will be affected by the update. Matches consisting of all small segments below the 8cM threshold will disappear, if AncestryDNA mean what they state in their whitepaper.

Unknown said...

I'm sorry, but the author is dead wrong about how it is "for the greater good" to lose legitimate matches. This change makes me really angry. All those people who grew up knowing their genetic parents, grandparents, etc, maybe they don't miss their 83000 distant potential cousins. But for me it has made all the difference in the world. Yes, I do know by the math that most of them are bad matches, or at least matches I will never be able to resolve. What I AM interested in is the perhaps 100 that are a match AND have enough tree information to fall into my ThruLines. I do realize that plenty of these people have bad info. I want to *SEE THEM* and the proposed ancestral links so that I can sort through that info, back it up with sources, and put clean information into my tree. This change makes me absolutely furious, and you have no place at all to say I am better off without it, just because the majority are bad matches or can never be linked to my own genetic tree. My subscription ends in August. If I lose this information that has made me so happy to spend God knows how many hours researching, sourcing, cleaning up, Ancestry loses my money.

Debbie Kennett said...

In response to Pie, you’re quite right that there’s a wide range in terms of the amount of DNA shared for different relationships and the further back in time the more pronounced this effect is. That’s why it’s so important, as you have done, to test other family members as they often pick up matches that you don’t. However, sharing a 7 cM match with a known relative still doesn’t make the match valid. The chances are still quite high that this is a false positive match or that this match is a reflection of another more distant and undocumented relationship rather than the one you've identified. Nevertheless you have confirmation of the relationship from your other relatives who do match your cousin. If someone matches you on GEDmatch but not at Ancestry then this should raise a red flag. It is highly likely that with Ancestry’s more accurate system of IBD detection they have been able to eliminate a false match and that it is GEDmatch that is wrong and not Ancestry.

In response to the anonymous comment which states “it is a lazy claim by people who do not research hard enough, and have no idea how to tackle large numbers of matches”, I’m afraid that no amount of experience at genealogy research is going to overcome the established scientific problems with these small segments. I am also from Europe and have nowhere near the number of high-resolution matches that most Americans have but I see no point in spending lots of time chasing elusive rainbows. About two thirds of these small matches are false matches and most of the rest are very distant so they cannot in any way constitute 30% to 40% of the useful information. I would far rather see more people in the Ancestry database from Europe than have access to all these small matches.

Ancestry do say in their white paper that “We have devoted much of this document to describing how we analyze an individual’s genotype to detect all IBD segments (greater than 8 cM) in our database in a way that balances accuracy and computational efficiency.” So undoubtedly this change has been made because of the computational load. Don’t underestimate the computing power needed to compute matches for 18 million people which requires trillions and trillions of comparisons. Any system has to be viable as the database grows.
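To put a rough number on "trillions and trillions", here is a minimal back-of-the-envelope sketch in Python. It assumes a naive all-against-all comparison, where every customer is compared once with every other customer. That is an assumption made purely for illustration; the white paper describes a pipeline designed to balance accuracy and computational efficiency, so the real workload will differ.

```python
from math import comb

# Database size quoted in the comment above.
customers = 18_000_000

# Assumption: a naive all-pairs comparison, i.e. "n choose 2" distinct pairs.
pairs = comb(customers, 2)

print(f"{pairs:,}")  # 161,999,991,000,000 (about 162 trillion pairs)
```

Because the number of pairs grows with the square of the database size, doubling the number of customers roughly quadruples the number of pairs under this naive assumption.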

In response to the many people who are claiming that their 6 and 7 cM matches are valid, I’m afraid that there is simply no way at the moment to determine whether or not these small matches are valid. If you are able to test your parents, as I have been able to do, it does at least allow you to eliminate about 50% of the false matches that don’t match either parent. But that doesn’t mean that the remaining matches are valid. Even if they are IBD they are far more likely to date back 10 or 20 generations or more. It is my understanding that only 6 and 7 cM matches will be removed so you will not need to save all your 8 cM matches.

Debbie Kennett said...

In response to Carolynn, the scientific process of inheritance and recombination is the same for everyone regardless of their ethnicity. However, much more work needs to be done on non-European populations. It is known, for example, that recombination rates and hotspots are different for people of African ancestry so it’s quite possible that, if genetic distance is being measured on a genetic map based on European samples, the map won’t necessarily apply to other populations. However, none of this will overcome the problems with small segments, which apply equally to everyone. For accurate reading of small segments we either need microarrays with many more SNPs or we need whole genome sequencing. There is a great danger that African Americans could be given false hope because they are relying on these small false matches. It's worth reading this blog post highlighted by Blaine Bettinger on the African Roots blog which cautions against wishful thinking and tunnel vision:

https://tracingafricanroots.wordpress.com/2017/05/10/how-to-find-those-elusive-african-dna-matches-on-ancestry-com/

Debbie Kennett said...

In response to Retired Coder, I think Ancestry's decision was based on both a desire to provide more accurate matches and to encourage better genealogy and also the need to conserve computing power. The more customers Ancestry have, the more high-level matches we will get so people won’t be tempted to go exploring down in the weeds where they can easily get led astray.

In response to Harry Angus, knowing the actual number of segments will be very useful. Matches sharing more than one segment are generally going to be more recent than matches sharing a single segment which can often go back a very long way. Knowing the length of the longest segment will be particularly useful for those who descend from endogamous populations and will help them to prioritise their matches so that they focus their energies on the matches that are more likely to be relevant. The trees are not just accessible to DNA matches. If you want to access large trees all you need to do is search the trees on Ancestry. This is far more effective than just searching trees attached to small matches where you're only accessing a small subset of the available trees. Chromosome browsers are for the few not the many. Very few people use them or understand how they work. It is no accident that AncestryDNA has the largest database and doesn’t have a chromosome browser. They are a source of considerable confusion and also lead to bad genealogy research because people misinterpret the matches and read meaning into patterns. That doesn't mean that a chromosome browser might not be helpful at AncestryDNA but it would require a huge educational investment to ensure that the tool was used correctly.

In response to Daniel, 23andMe does not have a cut off at 20 cM. You are restricted to your top 2000 matches and this includes those matches who have not opted in to DNA Relatives. I have some matches at 23andMe that only share 6 or 7 cMs.

In response to Philip Cowen, chromosome browsers only let you see where you match a person. They do not provide any tools to allow you to determine whether or not a match is valid. The best way to determine if small segment matches are valid is to see if the match is still maintained with whole genome sequencing but that is not a viable option at present.

In response to the unknown commenter with ancestry from Sicily, unfortunately having a well developed tree has no bearing on whether or not a match is valid.

In response to the manager of the Marrinan DNA Project, having known genealogical relationships from a known Marrinan descendant is no guarantee that a small 6 cM match is associated with descent from this ancestor or that the match is valid.

In response to unknown who says “I have found true matches at 6 and 7 and when comparing to some direct family members we can see that chr data to be the same even when we don't meet the new threshold of 8 cMs” I’m afraid that, as I’ve repeatedly emphasised, we currently have no way of determining whether or not these matches are “true”. Even if they are valid, they are far more likely to trace back for 10 or 20 generations. The amount of genealogical research you do has no bearing whatsoever on the validity of these small segment matches.

In response to Amiable Angel, African American researchers are not the only ones who descend from half-relationships. Small matches are mostly false matches in all populations though of course they can still be helpful if they lead to genealogical connections. The concern is that the matches will be misinterpreted leading to people drawing false conclusions from the DNA data. However, there’s no harm in saving the small matches if you think they might help.

Linda, I thought Diahan did a very good blog post and I’ve now included a link in my further reading section.

Debbie Kennett said...

Glynne, can I suggest you contact Ancestry with your suggestion?

In response to Robin and the anonymous comment about the Ancestry white paper, the paper says the following “An important feature of our method is that we do not keep track of all matching segments; in step 5, we filter out a candidate match if its genetic distance is less than 8 cM.” The number of segments reported by Ancestry is currently very unreliable so a fourth cousin who currently matches on 18cM across 3 segments might turn out to be a fourth cousin who matches you on a single 18 cM segment under the new system.
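The two readings discussed here can be illustrated with a small Python sketch. It is not Ancestry's pipeline: the helper names `surviving_total` and `is_reported` are hypothetical, and the rule that a match is kept only if its surviving segments add up to at least 8 cM is an assumption made for the example. The only figure taken from the white paper is the 8 cM threshold.

```python
MIN_CM = 8.0  # per-segment threshold quoted from the white paper (step 5)


def surviving_total(segment_cms, min_cm=MIN_CM):
    """Hypothetical helper: sum only the segments at least min_cm long."""
    return sum(cm for cm in segment_cms if cm >= min_cm)


def is_reported(segment_cms, min_cm=MIN_CM):
    """Assumption: a match is reported if its surviving total reaches min_cm."""
    return surviving_total(segment_cms, min_cm) >= min_cm


# The same 18 cM match under two possible segment breakdowns.
three_small = [6.0, 6.0, 6.0]  # 18 cM reported as 3 segments of 6 cM each
one_long = [18.0]              # 18 cM that is really a single segment

print(is_reported(three_small))  # False
print(is_reported(one_long))     # True
```

Under these assumptions the outcome depends entirely on how the 18 cM is really distributed, which is why an unreliable segment count matters so much to this question.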

In response to the anonymous commenter with 83,000 matches, I think we will all benefit by having matches that are reliable and that are genetically related to us. The matches that will be lost are for the most part not legitimate matches or genealogically relevant matches. I only have about 30,000 matches at AncestryDNA and with the best will in the world I haven’t got time to follow up on all these matches. In any case I can see from many of the trees that there is no recent genealogical relationship with these matches and it’s a waste of my time to pursue them. I’d much rather see Ancestry expand their database particularly in Europe and bring in lots of new users.

I hope I've now responded to all the comments where required.

Unknown said...

If I color code my less than 8 cM matches now, but remove the dot at a later date, will that match disappear? In other words, will nightly or weekly updates to the Ancestry system constantly remove these small matches if there isn't a dot? Or is it just a one-time removal this August?

Robin in Short Pump said...

That’s a good question. How about, if you need to remove the dot, you replace it with a note?

Unknown said...

Thanks for your time and energy and wisdom replying to all the comments - and writing the original post.

Debbie Kennett said...

I'm afraid I don't know if a match will disappear if a dot is removed. I would advise writing a note to remind yourself why you saved the match in the first place and that will guarantee that it won't disappear.

Thank you unknown for your kind words. It's nice to be appreciated!

Unknown said...

Could somebody explain how you make a Note or a Group in Ancestry? I always ignored them because there was no content, not realising I was supposed to be making the content.

Unknown said...

To Unknown on 31 July at 17:40. To make a group, from either your list of DNA matches, or on an individual match, click "Add to Group." The first time you will have to create a custom group. To make a note, on the individual match, click "Add note."

Talk in the country said...

For someone like me the lower matches have been invaluable. I have 2 close male relatives (cousin and 1/2 brother) who have tested their DNA. Then we all have 4 more generations on the paternal side with single children (our direct ancestors). Without the smaller cMs we would be dead in the water to help us find out more connections. Without them we also couldn't have solved an NPE.