Cell-Specific Filters

(1) Remove genotype in cell with quality < X

The genotype quality score (GQ) represents the Phred-scaled confidence that the genotype assignment (GT) is correct. It is derived from the normalized Phred-scaled likelihoods of the possible genotypes (PL). Specifically, the quality score is the difference between the PL of the second most likely genotype and the PL of the most likely genotype. The PLs are normalized so that the most likely genotype always has a PL of 0, so the quality score equals the second smallest PL, unless that PL is greater than 99. In GATK, the quality score is capped at 99 because larger values are not more informative but take more space in the file. So if the second smallest PL is greater than 99, the assigned quality score is still 99.

In short, the GQ is the difference between the Phred-scaled likelihoods of the two most likely genotypes. A low GQ indicates low confidence in the genotype, i.e. there was not enough evidence to confidently choose one genotype over another.
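The calculation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the variant caller's actual code; the function name genotype_quality and its cap parameter are hypothetical.

```python
def genotype_quality(pls, cap=99):
    """Illustrative GQ calculation from a list of PL values.

    pls: one Phred-scaled likelihood per possible genotype.
    cap: maximum reported quality (99, as described in the text).
    """
    if len(pls) < 2:
        raise ValueError("need at least two PL values")
    best = min(pls)
    # Normalize so that the most likely genotype has PL 0.
    normalized = sorted(pl - best for pl in pls)
    # GQ is the second smallest normalized PL, capped at `cap`.
    return min(normalized[1], cap)


print(genotype_quality([0, 25, 300]))   # second smallest PL is below the cap
print(genotype_quality([0, 150, 900]))  # second smallest PL exceeds the cap
```

With PLs of 0, 25 and 300 the quality score is 25, which would fall below a threshold of 30. With PLs of 0, 150 and 900 the score is capped at 99.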

We recommend setting X to 30, so that genotypes with a quality below 30 are removed.

(2) Remove genotype in cell with read depth < X

The Read Depth per Variant metric is the filtered depth, at the cell level. This gives the number of filtered reads that support each of the reported alleles.

We recommend setting X to 10, so that genotypes with a read depth below 10 are removed.

(3) Remove genotype in cell with alternate allele freq < X

The Alternate Allele Frequency is calculated from the unfiltered allele depth, i.e. the number of reads that support each of the reported alleles. All reads at the position (including reads that did not pass the variant caller's filters) are counted, except reads that were considered uninformative. A read is considered uninformative when it does not provide enough statistical evidence to support one allele over another. Only non-reference genotype calls are included.

We recommend setting X to 20%, so that genotypes with an alternate allele frequency below 20% are removed.
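As a rough illustration of how the three cell-specific filters combine, the sketch below checks a single genotype call against the thresholds recommended above. The function name, its parameters, and the frequency formula (alternate reads divided by reference plus alternate reads) are assumptions made for this example and may not match the software's internal calculation. The sketch also assumes that the alternate allele frequency filter is applied only to non-reference calls.

```python
def passes_cell_filters(gq, dp, ad_ref, ad_alt, is_non_ref,
                        min_gq=30, min_dp=10, min_af=20.0):
    """Return True if a genotype call in one cell survives all three filters.

    gq: genotype quality; dp: filtered read depth.
    ad_ref, ad_alt: unfiltered reads supporting the reference and
    alternate allele (hypothetical inputs for this sketch).
    is_non_ref: True for non-reference genotype calls.
    """
    if gq < min_gq:          # filter (1): genotype quality
        return False
    if dp < min_dp:          # filter (2): read depth
        return False
    if is_non_ref:           # filter (3): assumed to apply to non-reference calls only
        total = ad_ref + ad_alt
        # Assumed definition: percentage of informative reads supporting ALT.
        af = 100.0 * ad_alt / total if total > 0 else 0.0
        if af < min_af:
            return False
    return True


# A heterozygous call with good quality, depth, and 50% alternate reads is kept.
print(passes_cell_filters(gq=45, dp=20, ad_ref=10, ad_alt=10, is_non_ref=True))
```

A call that fails any single check is removed from that cell only; calls for the same variant in other cells are evaluated independently.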