DNA Coverage with Multiple Kits
How much of an ancestor’s or relative’s DNA can you reproduce when you combine kits of multiple testers?
I wrote the original article on DNA coverage from multiple kits a few years ago. I included tables that showed the amount of an ancestor’s autosomal DNA you could reproduce if you combined kits from multiple relatives. Those tables gave exact averages and approximate ranges. The averages can often be calculated in your head, but the ranges of reproduced DNA can’t be; they have to come from simulation.
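As an illustration of why the averages are easy, here is a minimal sketch. It assumes that each child independently inherits half of a parent’s autosomal DNA, so the share of the parent’s DNA missing from all of n children averages (1/2)^n. The function name is my own choice for this example and isn’t taken from any of my tools.

```python
def expected_parent_coverage(n_children: int) -> float:
    """Average fraction of one parent's autosomal DNA that shows up
    in at least one of n_children tested children.

    Assumption: each child independently inherits half of the parent's DNA.
    """
    if n_children < 0:
        raise ValueError("n_children must be non-negative")
    # The portion missed by every child averages (1/2) ** n_children.
    return 1.0 - 0.5 ** n_children


if __name__ == "__main__":
    for n in range(1, 7):
        print(f"{n} children: {expected_parent_coverage(n):.2%}")
```

Under that assumption, three children cover 87.5% of a parent on average. The formula says nothing about how far a particular family can stray from the average, which is the part that needs a simulation.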
My simulations have gone through several updates since that first article. The original was a simple marble model, and I didn’t trust its ranges very much at the time. I also wanted standard deviations for paternal and maternal relatives so that I could show the true differences in ranges between the two, but it took a long time for anyone to publish those. By then I had developed an X model and reused some of that code to make an autosomal model with two chromosome copies. When the standard deviations of Veller et al. (2019 and 2020) were published, I started treating paternal and maternal recombination differently in my new model.
The results of this model are the only ones trained on sex-specific standard deviations from the peer-reviewed literature, and my tables of shared DNA are the only ones that acknowledge the difference between paternal and maternal shared DNA ranges. Now I’ve finally updated the project that was once the most popular among my readers. I’ve only simulated DNA coverage for parents so far, but I will eventually add combinations of other relatives, as I did before.
These values mostly agree with the very simple marble model that I originally used to calculate DNA coverage. The main difference I see is that it no longer appears possible to reproduce a parent’s entire genome with five children testing, whereas the marble model had found that about 2.5% of five-children families could do so. Given that there probably aren’t hundreds of thousands of families that have had five or six children tested, the 99% confidence interval is the best one to use for this question. I think it’s safe to say that no family of five or fewer children who have all had their DNA genotyped has managed to reproduce all of a parent’s SNPs, since that doesn’t appear to happen even once in 500,000 families. However, there may be some families with six children who have all of their father’s SNPs covered.
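For readers who want a feel for how a marble-style model behaves, here is a rough sketch. It is not the code behind my original article or the new model. It assumes the parent’s genome is a set of independent positions ("marbles"), each with two copies, and that every child receives one copy per position at random. The number of marbles, the number of simulated families, and all function names are hypothetical choices for illustration, and the sketch ignores real recombination and any paternal or maternal differences.

```python
import random


def simulate_parent_coverage(n_children, n_marbles=50, n_families=2000, seed=0):
    """Return one coverage fraction per simulated family.

    Assumption (toy model): the parent's genome is n_marbles independent
    positions with two copies each; every child gets one copy per position.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_families):
        covered = 0
        for _ in range(n_marbles):
            # Record which of the two parental copies each child received here.
            copies_seen = {rng.randrange(2) for _ in range(n_children)}
            covered += len(copies_seen)
        results.append(covered / (2 * n_marbles))
    return results


def full_coverage_probability(n_children, n_marbles):
    """Exact chance, in this toy model, that every copy at every position
    appears in at least one child."""
    if n_children < 1:
        return 0.0
    # A position is fully covered unless all children received the same copy.
    per_position = 1.0 - 2.0 * 0.5 ** n_children
    return per_position ** n_marbles


if __name__ == "__main__":
    runs = sorted(simulate_parent_coverage(5))
    low = runs[int(0.005 * len(runs))]
    high = runs[int(0.995 * len(runs)) - 1]
    print(f"mean coverage: {sum(runs) / len(runs):.4f}")
    print(f"central 99% of families: {low:.4f} to {high:.4f}")
    print(f"chance of full coverage: {full_coverage_probability(5, 50):.4f}")
```

In a model like this, the chance of full coverage depends heavily on how many independent marbles you assume, which is one reason a toy model and a chromosome-based model can disagree at the extremes while agreeing on the averages.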
I hope you’ve found these results useful. More will be on the way. If you want to see more results now, you could check the ones with less accurate ranges in the original article. Or you can calculate results yourself with the online calculator made from that simpler model.
Cover photo by Sharon McCutcheon. Feel free to ask me about modeling & simulation, genetic genealogy, or genealogical research. And make sure to check out these ranges of shared DNA percentages or shared centiMorgans, which are the only published values that match peer-reviewed standard deviations. That model was also used to make a very accurate relationship prediction tool. Or, try a calculator that lets you find the amount of an ancestor’s DNA you have when combining multiple kits.
Originally published at http://www.dna-sci.com.