MSI classifier: filter input variant callset by VAF, AD #282

@SeanNesdoly

Description

Is your feature request related to a problem? Please describe.

I have some tumour samples where filtering by Variant Allele Fraction (VAF) or Allelic Depth (AD) is important for obtaining accurate Tumour Mutational Burden (TMB) and Microsatellite Instability (MSI) calls.

Currently, there exist parameters to filter the input variant callset by VAF or sequencing depth:

  • {tumor,control}_{af,dp}_min
  • tmb_{af,dp}_min

However, to my knowledge, none of these filters alter the variant callset used as input to the MSI classifier. As a result, when low-VAF variants (e.g., from subclonality or sequencing artefacts) artificially inflate the variant count, two classifier inputs are affected:

  • the MSI-based TMB calculations (distinct from PCGR's calculate_tmb())
  • Fraction of INDELs among all calls (#INDELs/(#SNVs+#INDELs))

Both of these features are weighted highly in the MSI classifier.
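To illustrate the effect on the INDEL fraction, here is a minimal sketch using made-up toy calls. The dictionary keys (`type`, `vaf`) and the 0.05 cutoff are assumptions for illustration only and do not reflect PCGR's internal column names or defaults.

```python
def indel_fraction(variants):
    """Fraction of INDELs among all calls: #INDELs / (#SNVs + #INDELs)."""
    n_indel = sum(1 for v in variants if v["type"] == "indel")
    n_snv = sum(1 for v in variants if v["type"] == "snv")
    total = n_snv + n_indel
    return n_indel / total if total else 0.0


# Synthetic example: 6 clonal SNVs, 2 clonal INDELs, 4 low-VAF INDELs
toy_calls = (
    [{"type": "snv", "vaf": 0.35}] * 6
    + [{"type": "indel", "vaf": 0.30}] * 2
    + [{"type": "indel", "vaf": 0.02}] * 4
)

# Hypothetical VAF cutoff, applied before computing the feature
kept = [v for v in toy_calls if v["vaf"] >= 0.05]

unfiltered_fraction = indel_fraction(toy_calls)  # 6 / 12
filtered_fraction = indel_fraction(kept)         # 2 / 8
```

In this toy case the unfiltered INDEL fraction is twice the filtered one, which is the kind of shift that could push a sample towards an MSI-high call.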

Describe the solution you'd like

Parameters for minimum VAF and AD should be added to filter the variant callset used as input to the MSI classifier, similar to the filtering applied in calculate_tmb() in pcgr/variant.py. These filters should also be resolved against the global minimums set by the {tumor,control}_{af,dp}_min parameters, so that an MSI-specific threshold never admits variants that the global filters have already excluded.
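A rough sketch of what I have in mind is below. The names `msi_af_min` and `msi_dp_min` are hypothetical (they do not exist in PCGR today), the `vaf`/`dp` keys are placeholders rather than PCGR's actual columns, and "resolving" is interpreted here as taking the stricter of the global and MSI-specific thresholds, which is only one possible reading.

```python
def resolve_min(global_min, msi_min):
    """Return the stricter (larger) of two thresholds; None means 'not set'."""
    values = [v for v in (global_min, msi_min) if v is not None]
    return max(values) if values else 0


def filter_for_msi(variants, tumor_af_min=None, tumor_dp_min=None,
                   msi_af_min=None, msi_dp_min=None):
    """Keep variants passing the effective VAF and depth minimums.

    msi_af_min / msi_dp_min are hypothetical new parameters.
    """
    af_min = resolve_min(tumor_af_min, msi_af_min)
    dp_min = resolve_min(tumor_dp_min, msi_dp_min)
    return [v for v in variants if v["vaf"] >= af_min and v["dp"] >= dp_min]


# Synthetic calls for illustration
calls = [
    {"vaf": 0.02, "dp": 100},  # fails VAF
    {"vaf": 0.10, "dp": 8},    # fails depth
    {"vaf": 0.30, "dp": 50},   # passes both
]

# The global tumor_af_min (0.05) wins over the looser msi_af_min (0.01)
msi_input = filter_for_msi(calls, tumor_af_min=0.05,
                           msi_af_min=0.01, msi_dp_min=10)
```

Only the filtered list would be passed to the MSI classifier; the full callset would remain available to the rest of the report.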

Describe alternatives you've considered

Pre-filtering variant callsets before they are parsed by PCGR is also possible, but this removes variants that could be of interest in other parts of the report.

Additional context

I briefly compared PCGR-based MSI calls against MSIsensor-pro; samples called as MSI-high by PCGR are often called microsatellite stable (MSS) by MSIsensor-pro.
