Relationship predictions are now updated to include differences in maternal and paternal recombination rates as well as validation of ranges by peer-reviewed standard deviations

The calculator can be found here.

I’ve previously published exact averages and very accurate ranges of shared DNA for any genealogical relationship that can be imagined. The model that produces these results is validated by the standard deviations of Veller et al. (2019 & 2020). Since the data that come out of this model are so accurate, and since they can be calculated for sex-specific genealogical relationships, which had never been done before, it seemed only natural to use it for a relationship probability calculator.

Probability curves for different relationship types

The most striking thing about the figures shown here…


How much of an ancestor’s or relative’s DNA can you reproduce when you combine kits of multiple testers?

I wrote the original article on DNA coverage from multiple kits a few years ago. I included tables that showed the amount of an ancestor’s autosomal DNA you could reproduce if you combined kits from multiple relatives. In those tables were exact averages and approximate ranges. The averages can often be calculated in your head, but ranges of reproduced DNA can’t be calculated.
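As a sketch of the kind of average that can be done in your head, take the simplest case: rebuilding one parent's autosomal DNA from the kits of several of that parent's children. Each child inherits half of the parent's DNA. Assuming each child's half is drawn independently at every position, the expected coverage from n children is 1 - (1/2)^n. The function name is only illustrative and doesn't come from the original tables.

```python
def expected_parent_coverage(num_children: int) -> float:
    """Expected fraction of one parent's autosomal DNA recovered from n children.

    Assumes every child independently inherits half of the parent's DNA,
    so a given position is missed by all n children with probability (1/2)**n.
    """
    if num_children < 0:
        raise ValueError("num_children must be non-negative")
    return 1.0 - 0.5 ** num_children


for n in range(1, 5):
    print(n, f"{expected_parent_coverage(n):.2%}")
```

This only gives the average. The spread around that average is what needs a simulation.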

My simulations have gone through several updates since that first article. It was originally a simple marble model. I didn’t trust the ranges very much at that time. I also wanted to obtain standard deviations for paternal…


Alternate title/misleading answer: 23andMe counts FIR twice

Scientists who aren’t familiar with genetic genealogy will be very confused by this question. After all, 50% is the expected amount, or average, that two siblings share. People asking this question have likely been trapped inside of an AncestryDNA bubble.

AncestryDNA doesn’t report the amount of fully-identical regions (FIR) that two people share with each other. To avoid confusion, I’ll note that they do count and use FIR in identifying and labeling full-siblings. And here’s one thing that some people won’t believe when you say it: AncestryDNA counts FIR as half-identical regions (HIR). They’ll riposte that AncestryDNA ignores FIR, but…
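To make the two counting conventions concrete, here is a small arithmetic sketch. It assumes the textbook expectation for full siblings: about 25% of the genome fully identical, about 50% half-identical only, and about 25% not shared. The function names are hypothetical and this isn't either company's actual code.

```python
# Expected full-sibling sharing, as fractions of genome length (an assumption
# based on each parental copy being shared with probability 1/2).
FIR = 0.25       # both copies match
HIR_ONLY = 0.50  # exactly one copy matches


def share_counting_fir_twice(hir_only: float, fir: float) -> float:
    """Fraction of all DNA (both copies) that matches; FIR counts on both copies."""
    return (hir_only + 2 * fir) / 2


def share_counting_fir_as_hir(hir_only: float, fir: float) -> float:
    """Same fraction when a fully identical region is counted only once."""
    return (hir_only + fir) / 2


print(share_counting_fir_twice(HIR_ONLY, FIR))   # 0.5
print(share_counting_fir_as_hir(HIR_ONLY, FIR))  # 0.375
```

Under these assumptions the same pair of siblings comes out at 50% with one convention and 37.5% with the other, even though nothing about their DNA has changed.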


People often ask a question similar to this one: If my ethnicity report is showing 5% Iberian, how far back should I expect to find an ancestor of mine who was Iberian? There’s a mathematical way to estimate the answer to that question. I’ll discuss that below and show some of the results I’ve calculated.

The first thing to note, because people will definitely point it out if I don’t, is that all of that Iberian didn’t necessarily come from just one ancestor. In that case, you might have a particular ancestor n generations back who was 75% Iberian and…
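For orientation, here is the simplest back-of-envelope version of that estimate. It assumes a single fully Iberian ancestor and exact halving in every generation, which is not necessarily the method used for the results in the article. The function names are illustrative.

```python
import math


def expected_fraction(generations_back: int) -> float:
    """Average share of your DNA from one ancestor n generations back (1/2**n)."""
    return 0.5 ** generations_back


def generations_for_fraction(fraction: float) -> float:
    """Invert the halving rule: how many generations back gives this average share."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return math.log2(1.0 / fraction)


print(round(generations_for_fraction(0.05), 2))  # about 4.32
print(expected_fraction(4), expected_fraction(5))  # 0.0625 0.03125
```

Under this simplification, 5% falls between a 2nd-great-grandparent (6.25% on average) and a 3rd-great-grandparent (3.125% on average).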


People often ask how their DNA match with a cousin will be inflated if they have a double relationship with that cousin. Some helpful people have offered that you can look up shared averages and ranges for certain relationships and double all of them, or for two different types of relationship you can add the different averages and ranges together. I’ve been curious about how accurate this method is, so I’ve finally decided to test the idea in a definitive way.

Some examples of double relationships are double first cousins, in which case two people share the same four grandparents…
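The rule of thumb being tested can be written out for the averages alone. This sketch assumes the two relationships run through separate lines of descent, and it says nothing about ranges, which are the part in question. The helper name is hypothetical.

```python
# Expected average shares for single relationships (standard halving values).
FIRST_COUSIN_AVG = 0.125        # 12.5%
HALF_FIRST_COUSIN_AVG = 0.0625  # 6.25%


def combined_average(*averages: float) -> float:
    """Add the expected shares of each separate relationship path."""
    return sum(averages)


# Double first cousins: two first-cousin relationships at once.
print(combined_average(FIRST_COUSIN_AVG, FIRST_COUSIN_AVG))  # 0.25

# A first-cousin relationship plus a half-first-cousin relationship.
print(combined_average(FIRST_COUSIN_AVG, HALF_FIRST_COUSIN_AVG))  # 0.1875
```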


The number of centiMorgans (cM) we share with a relative tells us how closely related we are to them. A cM is the probability, in percentage, that a crossover will occur between two genetic loci during a given meiosis event. The units of cM are right there in the definition: it’s a probability, in percentage. However, if you’re a student of genetics you’ll know, or if you do a quick internet search for “centiMorgans” you’ll see, that almost everyone refers to it as a length or a distance. That’s because the purpose of a cM is to quantify how far…


Using very simple math to get the most out of multiple kits

Many genetic genealogy enthusiasts have their own DNA genotyped as well as some of their siblings. As the enthusiast of your family, you might have access to all of these kits. Or, if they’re all on GEDmatch.com, then you can use these kits in tools whether or not you’re the manager. One thing to definitely take advantage of at that site is the Lazarus tool. However, people often find that not enough of the necessary relatives have their DNA genotyped in order to successfully make a Lazarus kit.

There’s another way to use multiple kits to great effect, and all…


The only cousin statistics that acknowledge the differences in paternal and maternal relatives due to recombination rates.

One million input Poisson rates, normalized, for maternal and paternal recombination.

Mothers’ genomes recombine about 42 times per meiosis, on average. By contrast, fathers’ genomes recombine only about 27 times, on average. This leads to a conclusion that’s intuitive to geneticists: more recombination decreases variance, leading to narrower ranges of shared DNA for maternal relatives. Less recombination results in more variance, which is why fully or predominantly paternal relatives can share a much wider range of DNA. This phenomenon has been blogged about by Graham Coop.
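The effect of the recombination rate on variance can be seen in a toy simulation. This is a minimal sketch and not the model described in this article: it treats the genome as a single string of length 1 with no separate chromosomes, places crossovers as a Poisson process with the given average count, and measures how much of the transmitted copy comes from one of the two grandparents. All names are illustrative.

```python
import random
import statistics


def grandparent_fraction(rate: float, rng: random.Random) -> float:
    """Fraction of a unit-length genome copied from grandparent A's homologue.

    Crossovers form a Poisson process with `rate` expected events, so the
    gaps between them are exponentially distributed.
    """
    from_a = rng.random() < 0.5  # which homologue the copy starts on
    pos, total_a = 0.0, 0.0
    while True:
        nxt = pos + rng.expovariate(rate)  # position of the next crossover
        end = min(nxt, 1.0)
        if from_a:
            total_a += end - pos
        if nxt >= 1.0:
            return total_a
        pos = nxt
        from_a = not from_a  # switch homologues at each crossover


def simulate(rate: float, trials: int = 20000, seed: int = 1) -> list:
    """Run many independent transmissions at one recombination rate."""
    rng = random.Random(seed)
    return [grandparent_fraction(rate, rng) for _ in range(trials)]


maternal = simulate(42)  # about 42 crossovers per transmission
paternal = simulate(27)  # about 27 crossovers per transmission

maternal_sd = statistics.pstdev(maternal)
paternal_sd = statistics.pstdev(paternal)
print(f"maternal sd: {maternal_sd:.4f}, paternal sd: {paternal_sd:.4f}")
```

Both runs average close to 50%, but the lower-rate run has the larger standard deviation. Because this toy version ignores the 22 separate autosomes, the absolute spreads it prints shouldn't be read as realistic values.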

I’ve developed an autosomal DNA model. It doesn’t rely on any mathematical tricks to, for example, take bad data and then stretch, compress, or otherwise manipulate…


A comparison of three models, including updated amount of shared DNA between various relatives and ancestors

Genetic Modeling Background

I recently published a comparison of autosomal genetic models. The first model, which I made about two years ago, is as simple as can be, but I think it captures the important insights. The new model adds the feature of two homologues per chromosome that you’d find if you peered into the genome of a real human. This allows the simulation of ‘genes’ or ‘segments’ switching places from one homologue to another, potentially multiple times. A constraint on both models is that siblings have to share 50% of their DNA, on average, but with a standard deviation…


Which Model Is Better, a Simpler One or a More Complicated One?

Genetic Modeling Background

A couple of years ago I made my first model of genetic inheritance. It was probably about as simple as such a model could be. Rather than having 22 separate autosomal chromosomes, it was more like having one long string of connected chromosomes. And rather than having two homologues, or a second copy of each chromosome, it was more like the string of chromosomes was twice as long as it would otherwise be. On top of that, chromosome ‘segments’ wouldn’t stay in any particular place during the model; order wouldn’t matter. I didn’t think about it at the…

Brit Nicholson
